Hi everyone,
I'm looking for a graph database for my project, and OrientDB is on the
shortlist. My benchmark task is to import and query data about accounts and
financial transactions. It's quite a large dataset: 1M accounts and 50M
transactions. The data is in CSV files, and I'm running the latest OrientDB, 2.0.10.
My schema is two CSV files:
ACCOUNT - account_key -> PK, account_owner
TRANSACTION - account_from_key, account_to_key, amount, date
I want to create a multigraph: accounts are vertices and transactions are
edges. But I couldn't configure oetl to do this. I searched the web,
but I didn't find any useful example of how to import just edges.
I got it working with the configuration below, but I had to create a vertex for
every transaction. Is there any way to import just edges?
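As a minimal sketch of the target shape (plain Python, not the OrientDB API; the `Transaction` dataclass and the `build_multigraph` helper are hypothetical names introduced only for illustration), accounts are the only vertices and each transaction row becomes one edge, so two accounts can be linked by many parallel edges:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One row of the TRANSACTION CSV, modelled as a directed edge."""
    account_from_key: int
    account_to_key: int
    amount: float
    date: str


def build_multigraph(accounts, transactions):
    """Build an in-memory multigraph.

    accounts: iterable of (account_key, account_owner) pairs (the vertices).
    transactions: iterable of Transaction (the edges).
    Returns (vertices, out_edges), where out_edges maps an account_key to the
    list of transactions leaving that account.
    """
    vertices = {key: owner for key, owner in accounts}
    out_edges = defaultdict(list)
    for t in transactions:
        # Both endpoints must already exist, just as an edge import would
        # need to look up both Account vertices by account_key.
        if t.account_from_key not in vertices:
            raise KeyError(f"unknown account_from_key: {t.account_from_key}")
        if t.account_to_key not in vertices:
            raise KeyError(f"unknown account_to_key: {t.account_to_key}")
        out_edges[t.account_from_key].append(t)
    return vertices, out_edges


accounts = [(1, "Alice"), (2, "Bob")]
transactions = [
    Transaction(1, 2, 100.0, "2015-06-01"),
    Transaction(1, 2, 250.0, "2015-06-02"),  # parallel edge, same endpoints
    Transaction(2, 1, 40.0, "2015-06-03"),
]
vertices, out_edges = build_multigraph(accounts, transactions)
```

No Transaction vertex appears anywhere in this model; the amount and date live on the edge itself.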
My next question: my hardware is an i7, 8GB RAM and a 7200 rpm HDD. When I
import Accounts, I get about 7000 vertices/sec; when I import Transactions, I
get just 500 vertices/sec. Is that normal, or is there a performance problem in
my configuration?
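For scale, here is a rough back-of-the-envelope estimate of what those rates mean for the full dataset. It assumes the observed throughput stays constant for the whole import (an assumption, not something measured), and `import_hours` is a hypothetical helper name:

```python
def import_hours(records: int, records_per_sec: float) -> float:
    """Hours needed to import `records` at a constant rate (assumed constant)."""
    return records / records_per_sec / 3600


# Figures taken from the post: 1M accounts at ~7000/sec,
# 50M transactions at ~500/sec (one vertex per transaction).
accounts_hours = import_hours(1_000_000, 7000)
transactions_hours = import_hours(50_000_000, 500)

print(f"accounts:     {accounts_hours * 60:.1f} minutes")
print(f"transactions: {transactions_hours:.1f} hours")
```

At these rates the accounts load in a couple of minutes, while the transactions alone would take more than a day.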
One last observation: after I import data with oetl and then open OrientDB
Studio in a web browser (127.0.0.1:2480), I'm unable to run another
import because the database is locked :(
account.json
{
  "source": { "file": { "path": "accounts.csv", "lock": false } },
  "extractor": { "row": {} },
  "transformers": [
    { "csv": {
        "separator": ",",
        "columnsOnFirstLine": false,
        "columns": ["account_key", "account_owner"]
      }
    },
    { "vertex": { "class": "Account" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:../databases/transactions",
      "dbType": "graph",
      "classes": [
        { "name": "Account", "extends": "V" },
        { "name": "Transaction", "extends": "V" }
      ],
      "indexes": [
        { "class": "Account", "fields": ["account_key:integer"], "type": "UNIQUE" },
        { "class": "Transaction", "fields": ["account_from_key:integer"], "type": "NOTUNIQUE" },
        { "class": "Transaction", "fields": ["account_to_key:integer"], "type": "NOTUNIQUE" }
      ]
    }
  }
}

transaction.json
{
  "source": { "file": { "path": "transactions.csv", "lock": false } },
  "extractor": { "row": {} },
  "transformers": [
    { "csv": {
        "separator": ",",
        "columnsOnFirstLine": false,
        "columns": ["account_from_key", "account_to_key", "amount"]
      }
    },
    { "vertex": { "class": "Transaction" } },
    { "edge": {
        "class": "OutTransaction",
        "joinFieldName": "account_from_key",
        "lookup": "Account.account_key",
        "direction": "out"
      }
    },
    { "edge": {
        "class": "InTransaction",
        "joinFieldName": "account_to_key",
        "lookup": "Account.account_key",
        "direction": "in"
      }
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:../databases/transactions",
      "dbType": "graph",
      "dbAutoCreate": true,
      "tx": false,
      "batchCommit": 1000,
      "classes": [
        { "name": "Account", "extends": "V" },
        { "name": "Transaction", "extends": "V" },
        { "name": "OutTransaction", "extends": "E" },
        { "name": "InTransaction", "extends": "E" }
      ],
      "indexes": [
        { "class": "Account", "fields": ["account_key:integer"], "type": "UNIQUE" },
        { "class": "Transaction", "fields": ["account_from_key:integer"], "type": "NOTUNIQUE" },
        { "class": "Transaction", "fields": ["account_to_key:integer"], "type": "NOTUNIQUE" }
      ]
    }
  }
}

Thanks for any advice
Marek