You are right, the ETL tool version I have is different (based on the pom file,
it is 0.9.2-SNAPSHOT), but I could not find the 1.7.8 version of the ETL tool.
Can you share where to get ETL 1.7.8, please?
On Monday, 8 September 2014 19:41:07 UTC+8, Curtis Mosters wrote:
>
> I think it is also important to mention which OrientDB and OrientDB-ETL
> version you are using =)
>
> Am Montag, 8. September 2014 10:51:31 UTC+2 schrieb Sarav:
>>
>> This is the file I used; please check what is wrong. Thanks.
>>
>>
>>
>> On Tuesday, 26 August 2014 23:20:01 UTC+8, Curtis Mosters wrote:
>>>
>>> Well, instead of using a JDBC connection to MySQL, I now want to load
>>> from a *CSV* file.
>>>
>>> So I tried out many things, and again ETL is giving me huge problems. It
>>> still seems pretty unstable and not well documented. The only tutorial on
>>> the internet is the one about DBPedia. I don't know what platform that is,
>>> but presumably it works the same way as a usual CSV like:
>>>
>>> id,name
>>> 1,Name1
>>> 2,Name2
>>> and so on
>>>
>>> So I tried it with my own example:
>>>
>>> {
>>>   "config": {
>>>     "verbose": true,
>>>     "fileDirectory": "C:/Users/kwoxer/Desktop/DB - orientdb/bin/backup/csv-etl/",
>>>     "fileName": "Person.csv.gz"
>>>   },
>>>   "begin": [
>>>     { "let": { "name": "$filePath", "value": "$fileDirectory.append( $fileName )" } },
>>>     { "let": { "name": "$className", "value": "$fileName.substring( 0, $fileName.indexOf(".") )" } }
>>>   ],
>>>   "source": {
>>>     "file": { "path": "$filePath", "lock": false }
>>>   },
>>>   "extractor": {
>>>     "row": {}
>>>   },
>>>   "transformers": [
>>>     { "csv": { "separator": ",", "nullValue": "NULL", "skipFrom": 2, "skipTo": 2 } },
>>>     { "vertex": { "class": "$className" } }
>>>   ],
>>>   "loader": {
>>>     "orientdb": {
>>>       "dbURL": "plocal:C:\Users\kwoxer\Desktop\DB - orientdb\databases\Test",
>>>       "dbUser": "root",
>>>       "dbPassword": "root",
>>>       "dbAutoCreate": true,
>>>       "tx": false,
>>>       "batchCommit": 1000,
>>>       "dbType": "graph",
>>>       "indexes": [ { "class": "V", "fields": ["id:string"], "type": "UNIQUE" } ]
>>>     }
>>>   }
>>> }
>>>
>>> But when I run this I get:
>>>
>>> C:\Users\kwoxer\Desktop\DB - orientdb\bin>oetl.bat backup\csv-etl\person.json
>>> OrientDB etl v.1.7.8 (build @BUILD@) www.orientechnologies.com
>>> BEGIN ETL PROCESSOR
>>>
>>> 2014-08-26 17:08:26:501 WARN Transformer [com.orientechnologies.orient.etl.transformer.OCSVTransformer@107598d7] returned null, skip rest of pipeline execution [OETLPipeline]
>>> END ETL PROCESSOR
>>> + extracted 1 rows (0 rows/sec) - 1 rows -> loaded 0 vertices (0 vertices/sec) Total time: 35ms [0 warnings, 0 errors]
>>>
>>> Some transformer warning; since there is no real example for a normal CSV,
>>> I cannot do anything with it. I also don't understand why the "skip" settings
>>> are mandatory. Why should I skip lines in a CSV? Is this just for DBPedia,
>>> where comments might occur?
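>>>
>>> (As far as I can tell from the transformer's parameter list, skipFrom and
>>> skipTo are optional and only exist to skip a range of lines, e.g. comment
>>> lines; for a plain CSV, a transformer block as simple as this sketch should
>>> be enough. I have not verified it against 1.7.8, and "Person" is only an
>>> example class name here:)
>>>
>>> "transformers": [
>>>   { "csv": { "separator": "," } },
>>>   { "vertex": { "class": "Person" } }
>>> ],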
>>>
>>> Could someone please add more examples? I just want to import a CSV with
>>> ETL, nothing else. Thanks.
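>>>
>>> For reference, here is the kind of minimal CSV-only configuration I would
>>> expect to work, reusing only keys that already appear above (a sketch only,
>>> not verified against ETL 1.7.8; the paths and the Person class name are
>>> placeholders for this example):
>>>
>>> {
>>>   "source": {
>>>     "file": { "path": "C:/Users/kwoxer/Desktop/DB - orientdb/bin/backup/csv-etl/Person.csv", "lock": false }
>>>   },
>>>   "extractor": {
>>>     "row": {}
>>>   },
>>>   "transformers": [
>>>     { "csv": { "separator": ",", "nullValue": "NULL" } },
>>>     { "vertex": { "class": "Person" } }
>>>   ],
>>>   "loader": {
>>>     "orientdb": {
>>>       "dbURL": "plocal:C:/Users/kwoxer/Desktop/DB - orientdb/databases/Test",
>>>       "dbUser": "root",
>>>       "dbPassword": "root",
>>>       "dbAutoCreate": true,
>>>       "tx": false,
>>>       "batchCommit": 1000,
>>>       "dbType": "graph"
>>>     }
>>>   }
>>> }
>>>
>>> If that loads the two sample rows as Person vertices when run the same way
>>> (oetl.bat backup\csv-etl\person.json), the remaining pieces (the begin/let
>>> variables, the .gz source, the index) could be added back one at a time.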
>>>
>>> BTW: I also tried the unzipped version:
>>>
>>> "fileName": "Person.csv"
>>>
>>> Same result...:
>>>
>>> C:\Users\kwoxer\Desktop\DB - orientdb\bin>oetl.bat backup\csv-etl\person.json
>>> OrientDB etl v.1.7.8 (build @BUILD@) www.orientechnologies.com
>>> BEGIN ETL PROCESSOR
>>>
>>> 2014-08-26 17:18:21:189 WARN Transformer [com.orientechnologies.orient.etl.transformer.OCSVTransformer@1747c] returned null, skip rest of pipeline execution [OETLPipeline]
>>> END ETL PROCESSOR
>>> + extracted 1 rows (0 rows/sec) - 1 rows -> loaded 0 vertices (0 vertices/sec) Total time: 25ms [0 warnings, 0 errors]
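>>>
>>> To see why the transformer returns null, it might help to raise the ETL log
>>> level; if I read the documentation correctly, the top-level "config" block
>>> also accepts a "log" option (the "debug" value below is my assumption from
>>> those docs, not something I have tested on 1.7.8):
>>>
>>> "config": {
>>>   "verbose": true,
>>>   "log": "debug"
>>> }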
>>>
>>>
>>>