My suggestion: use the Java API. It works 100%.
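
A minimal sketch of the idea in plain Java (standard library only, not OrientDB code): handle the CSV one row at a time and keep one in-memory map per redundant column (D to G), so that each distinct title gets exactly one vertex and the main record is linked to it. The names `DedupImport` and `importRow`, the integer ids, and the string-encoded edges are hypothetical stand-ins. In a real import, the map lookup would presumably become a lookup-or-create of a vertex by its "title", and each edge string an edge created through the graph API. It also assumes a naive comma split with no quoted fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DedupImport {
    // Columns D..G sit at zero-based indexes 3..6 of each CSV row.
    static final int FIRST_SHARED = 3;
    static final int LAST_SHARED = 6;

    // One title -> vertex-id map per shared column
    // (stand-in for a unique index on "title" in each vertex class).
    final List<Map<String, Integer>> titleVertices = new ArrayList<>();
    // Edges encoded as "mainKey->ColumnX#id" (stand-in for real graph edges).
    final List<String> edges = new ArrayList<>();
    int mainRecords = 0;

    DedupImport() {
        for (int i = FIRST_SHARED; i <= LAST_SHARED; i++) {
            titleVertices.add(new HashMap<>());
        }
    }

    // Processes a single CSV line; call it once per line while streaming the file.
    void importRow(String line) {
        if (line.isEmpty()) {
            return;
        }
        String[] cols = line.split(",", -1); // naive split, keeps trailing empty fields
        String mainKey = cols[0];            // column A: primary key of MainRecord
        mainRecords++;

        for (int c = FIRST_SHARED; c <= LAST_SHARED && c < cols.length; c++) {
            String title = cols[c];
            if (title.isEmpty() || title.equals("NULL")) {
                continue; // nothing to link for this column
            }
            Map<String, Integer> vertices = titleVertices.get(c - FIRST_SHARED);
            Integer id = vertices.get(title);
            if (id == null) {
                // First time this title is seen: create its vertex exactly once.
                id = vertices.size();
                vertices.put(title, id);
            }
            char columnName = (char) ('A' + c); // 3 -> 'D', 4 -> 'E', ...
            edges.add(mainKey + "->Column" + columnName + "#" + id);
        }
    }
}
```

Because the file is read row by row, only the distinct titles stay in memory, not the 240 GB of rows. With a real file, the rows could be fed in with something like `Files.lines(path).forEach(importer::importRow)`.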

On Thu, 8 March 2018 at 10:32, Kunal Goyal <[email protected]> wrote:

> Hi Lars,
>
> Did you find a solution to this problem?
>
> On Thursday, January 22, 2015 at 12:36:15 AM UTC+5:30, Lars Plessmann
> wrote:
>>
>> I have a really huge CSV file (about 240 GB) with several columns (let's
>> say columns A - H).
>> The first column, A, is the primary key of the main record (vertex
>> MainRecord). But columns D, E, F and G should each be stored in their own
>> vertex (because these fields are redundant across all the records and I
>> don't want to store them in the main record again and again). So each
>> column value of D-G should be stored as a property called "title" in
>> a new vertex (without generating duplicates). Afterwards these
>> vertices need to be linked.
>> Is it possible to achieve this with a single orient-etl configuration? The
>> only way I know is to split the huge CSV file by columns and
>> create separate files for each vertex. But I don't want to do this if it
>> is not necessary (the file is so big).
>> I hope you can give me some advice.
>>
>> I'll try to describe what I need in the config JSON syntax (of course,
>> this will not work):
>>
>> {
>>   "source": {
>>     "file": {
>>       "path": "dataexport.csv"
>>     }
>>   },
>>   "extractor": {"row": {}},
>>   "transformers": [
>>     {
>>       "csv": {
>>         "separator": ",",
>>         "nullValue": "NULL",
>>         "skipFrom": -1,
>>         "skipTo": -1
>>       }
>>     },
>>     {
>>       "field": {
>>         "fieldName": "_id",
>>         "expression": "$input._id.substring(9, 33)"
>>       }
>>     },
>>     {
>>       "field": {
>>         "fieldName": "colD",
>>         "class": "ColumnD",
>>         "classProperty": "title"
>>       }
>>     },
>>     {
>>       "field": {
>>         "fieldName": "colE",
>>         "class": "ColumnE",
>>         "classProperty": "title"
>>       }
>>     },
>>     {
>>       "field": {
>>         "fieldName": "colF",
>>         "class": "ColumnF",
>>         "classProperty": "title"
>>       }
>>     },
>>     {
>>       "field": {
>>         "fieldName": "colG",
>>         "class": "ColumnG",
>>         "classProperty": "title"
>>       }
>>     },
>>     {
>>       "vertex": {"class": "MainRecord"}
>>     }
>>   ],
>>   "loader": {
>>     "orientdb": {
>>       "dbURL": "remote:127.0.0.1/msales_testing",
>>       "dbUser": "admin",
>>       "dbPassword": "admin",
>>       "dbAutoCreate": true,
>>       "dbType": "graph",
>>       "classes": [
>>         {
>>           "name": "MainRecord",
>>           "extends": "V"
>>         },
>>         {
>>           "name": "ColumnD",
>>           "extends": "V"
>>         },
>>         {
>>           "name": "ColumnE",
>>           "extends": "V"
>>         },
>>         {
>>           "name": "ColumnF",
>>           "extends": "V"
>>         },
>>         {
>>           "name": "ColumnG",
>>           "extends": "V"
>>         }
>>       ],
>>       "indexes": [
>>         {
>>           "class": "MainRecord",
>>           "fields": ["_id:string"],
>>           "type": "UNIQUE"
>>         }
>>       ]
>>     }
>>   }
>> }
>>
>>
>>
>> By the way: _id is in the MongoDB ObjectID format. I just want to store
>> the original hex value, so I used the substring sql method to extract the
>> hex id. Maybe there is a better way.
>>
>>
>> regards
>> Lars
>>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

