My suggestion: use the Java API for this. It works 100%.
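
To make that a bit more concrete, here is a rough sketch of the logic I have in mind. It is not OrientDB code: plain Java collections stand in for the database, and every name in it (DedupImportSketch, importCsv, the column positions, the naive comma split) is an assumption made for illustration. In a real import, the map lookups would become index lookups and the vertex and edge creation would become the corresponding graph API calls.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: in-memory maps stand in for the graph database.
final class DedupImportSketch {
    // Assumption: columns D..G (0-based positions 3..6) each get a vertex class.
    static final String[] LOOKUP_CLASSES = {"ColumnD", "ColumnE", "ColumnF", "ColumnG"};
    static final int FIRST_LOOKUP_COLUMN = 3;

    // One cache per lookup class: title -> vertex id, so each title is created once.
    final Map<String, Map<String, Integer>> titleToVertex = new HashMap<>();
    // Edges recorded as "mainRecordKey -> ClassName#vertexId".
    final List<String> edges = new ArrayList<>();
    int mainRecords = 0;
    private int nextVertexId = 0;

    // Streams the CSV line by line, so the whole file is never held in memory.
    void importCsv(BufferedReader reader) {
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive split: does not handle quoted fields containing commas.
                String[] cols = line.split(",", -1);
                if (cols.length < FIRST_LOOKUP_COLUMN + LOOKUP_CLASSES.length) {
                    continue; // skip empty or short rows
                }
                String key = cols[0]; // column A, the primary key
                mainRecords++;        // here the MainRecord vertex would be created
                for (int i = 0; i < LOOKUP_CLASSES.length; i++) {
                    String title = cols[FIRST_LOOKUP_COLUMN + i];
                    if (title.equals("NULL")) {
                        continue; // same null marker as in the ETL config
                    }
                    // Reuse the existing vertex for this title, or create a new one.
                    int vertexId = titleToVertex
                            .computeIfAbsent(LOOKUP_CLASSES[i], c -> new HashMap<>())
                            .computeIfAbsent(title, t -> nextVertexId++);
                    edges.add(key + " -> " + LOOKUP_CLASSES[i] + "#" + vertexId);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String csv = "1,b,c,red,small,x,y,h\n" + "2,b,c,red,large,x,NULL,h\n";
        DedupImportSketch sketch = new DedupImportSketch();
        sketch.importCsv(new BufferedReader(new StringReader(csv)));
        // prints "2 main records, 7 edges"
        System.out.println(sketch.mainRecords + " main records, " + sketch.edges.size() + " edges");
    }
}
```

The point of the per-class cache is that only the distinct titles are kept in memory, not the rows, which should be workable if the D-G values really are as redundant as described. If they are not, the cache would have to be replaced by a lookup against a unique index on "title".
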
On Thu, 8 March 2018 at 10:32, Kunal Goyal <[email protected]> wrote:
> Hi Lars,
>
> Did you find a solution to this problem?
>
> On Thursday, January 22, 2015 at 12:36:15 AM UTC+5:30, Lars Plessmann
> wrote:
>>
>> I have a really huge CSV file (about 240 GB) with several columns (let's
>> say columns A - H).
>> The first column, A, is the primary key of the main record (vertex
>> MainRecord). But columns D, E, F and G should each be stored in their own
>> vertex, because these values are redundant across all the records and I
>> don't want to store them in the main record again and again. So each value
>> in columns D-G should be stored as a property called "title" in a new
>> vertex, without generating duplicates. Afterwards these vertices need to
>> be linked.
>> Is it possible to achieve this with a single orient-etl configuration? The
>> only way I know is to split the huge CSV file by columns and create
>> separate files for each vertex. But I don't want to do that if it is not
>> necessary (the file is so big).
>> Can you give me some advice?
>>
>> I have tried to describe what I need in the config JSON syntax (of
>> course, this will not work as written):
>>
>> {
>> "source": {
>> "file": {
>> "path": "dataexport.csv"
>> }
>> },
>> "extractor": {"row": {}},
>> "transformers": [
>> {
>> "csv": {
>> "separator": ",",
>> "nullValue": "NULL",
>> "skipFrom": -1,
>> "skipTo": -1
>> }
>> },
>> {
>> "field": {
>> "fieldName": "_id",
>> "expression": "$input._id.substring(9, 33)"
>> }
>> },
>> {
>> "field": {
>> "fieldName": "colD",
>> "class": "ColumnD",
>> "classProperty": "title"
>> }
>> },
>> {
>> "field": {
>> "fieldName": "colE",
>> "class": "ColumnE",
>> "classProperty": "title"
>> }
>> },
>> {
>> "field": {
>> "fieldName": "colF",
>> "class": "ColumnF",
>> "classProperty": "title"
>> }
>> },
>> {
>> "field": {
>> "fieldName": "colG",
>> "class": "ColumnG",
>> "classProperty": "title"
>> }
>> },
>> {
>> "vertex": {"class": "MainRecord"}
>> }
>> ],
>> "loader": {
>> "orientdb": {
>> "dbURL": "remote:127.0.0.1/msales_testing",
>> "dbUser": "admin",
>> "dbPassword": "admin",
>> "dbAutoCreate": true,
>> "dbType": "graph",
>> "classes": [
>> {
>> "name": "MainRecord",
>> "extends": "V"
>> },
>> {
>> "name": "ColumnD",
>> "extends": "V"
>> },
>> {
>> "name": "ColumnE",
>> "extends": "V"
>> },
>> {
>> "name": "ColumnF",
>> "extends": "V"
>> },
>> {
>> "name": "ColumnG",
>> "extends": "V"
>> }
>> ],
>> "indexes": [
>> {
>> "class": "MainRecord",
>> "fields": ["_id:string"],
>> "type": "UNIQUE"
>> }
>> ]
>> }
>> }
>> }
>>
>>
>>
>> By the way: _id is in the MongoDB ObjectID format. I just want to store
>> the original hex value, so I used the SQL substring method to extract the
>> hex ID. Maybe there is a better way.
>>
>>
>> regards
>> Lars
>>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>