I have a problem where I need to generate edges between nodes using two or more join fields to resolve the match correctly.
It's similar to this question on Stack Overflow: <http://stackoverflow.com/questions/39517796/orientdb-etl-edge-transformer-2-joinfieldnames?noredirect=1&lq=1>. The solution given there is to add multiple joinFieldName entries to the edge transformer, but this didn't work as expected when I tried it.

If I change the data by appending a new row, 2,1, to each data file, I get this:

data1.csv

    a1,a2
    1,1
    1,2
    2,3
    2,1

data2.csv

    b1,b2
    1,1
    2,3
    1,2
    2,1

Then I use the JSON provided:

data1.json

    {
      "source": { "file": { "path": "./data1.csv" } },
      "extractor": { "csv": {} },
      "transformers": [
        { "vertex": { "class": "A" } }
      ],
      "loader": {
        "orientdb": {
          "dbURL": "plocal:./test.orientdb",
          "dbType": "graph",
          "dbAutoCreate": true,
          "classes": [
            {"name": "A", "extends": "V"},
            {"name": "B", "extends": "V"},
            {"name": "Conn", "extends": "E"}
          ]
        }
      }
    }

data2.json

    {
      "source": { "file": { "path": "./data2.csv" } },
      "extractor": { "csv": {} },
      "transformers": [
        { "vertex": { "class": "B" } },
        { "edge": {
            "class": "Conn",
            "joinFieldName": "b1",
            "lookup": "A.a1",
            "joinFieldName": "b2",
            "lookup": "A.a2",
            "direction": "out"
        }}
      ],
      "loader": {
        "orientdb": {
          "dbURL": "plocal:./test.orientdb",
          "dbType": "graph",
          "dbAutoCreate": true,
          "classes": [
            {"name": "B", "extends": "V"},
            {"name": "Conn", "extends": "E"}
          ]
        }
      }
    }

Running oetl.sh on data1.json and then on data2.json gives me this:

    orientdb {db=test.orientdb}> select from v

    +----+-----+------+----+----+-------------+----+----+-------------+
    |#   |@RID |@CLASS|a1  |a2  |in_Conn      |b2  |b1  |out_Conn     |
    +----+-----+------+----+----+-------------+----+----+-------------+
    |0   |#25:0|A     |1   |1   |[#41:0,#45:0]|    |    |             |
    |1   |#26:0|A     |1   |2   |[#44:0]      |    |    |             |
    |2   |#27:0|A     |2   |3   |[#43:0]      |    |    |             |
    |3   |#28:0|A     |2   |1   |[#42:0,#46:0]|    |    |             |
    |4   |#33:0|B     |    |    |             |1   |1   |[#41:0,#42:0]|
    |5   |#34:0|B     |    |    |             |3   |2   |[#43:0]      |
    |6   |#35:0|B     |    |    |             |2   |1   |[#44:0]      |
    |7   |#36:0|B     |    |    |             |1   |2   |[#45:0,#46:0]|
    +----+-----+------+----+----+-------------+----+----+-------------+

    8 item(s) found. Query executed in 0.01 sec(s).

This seems wrong to me. If I write out the edges (note that the table lists b2 before b1):

    A(1,1) <-- #41:0 --- B(1,1)   OK
    A(1,1) <-- #45:0 --- B(2,1)   WRONG
    A(1,2) <-- #44:0 --- B(1,2)   OK
    A(2,3) <-- #43:0 --- B(2,3)   OK
    A(2,1) <-- #42:0 --- B(1,1)   WRONG
    A(2,1) <-- #46:0 --- B(2,1)   OK

My understanding is that the two joinFieldName entries *should* create an AND operation between the two keys, so I expect an A to match a B only if A.a1 == B.b1 AND A.a2 == B.b2. That isn't what is happening. From the looks of it, the first joinFieldName is ignored and the second joinFieldName entry is the only one actually used to match.

Is this a bug? If not, and it's working as intended, how can I set up ETL to generate edges between nodes based on more than one field?

Thanks!
-William
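One way to sanity-check the edge list above outside OrientDB is a small Python sketch. It is not OrientDB code and says nothing definite about how the ETL module parses its configuration; it only shows that (a) Python's own `json` module keeps just the last of two duplicate keys, and (b) matching on the second field alone reproduces exactly the six edges in the query output, while an AND match on both fields would yield four. The function names and the `OBSERVED_EDGES` constant are made up for illustration; the data rows and RID pairings are copied from the post.

```python
import json

# Rows from data1.csv (class A) and data2.csv (class B), as (field1, field2).
A_ROWS = [(1, 1), (1, 2), (2, 3), (2, 1)]
B_ROWS = [(1, 1), (2, 3), (1, 2), (2, 1)]

# Body of the "edge" transformer from data2.json, duplicate keys included.
EDGE_CONFIG = """
{ "class": "Conn",
  "joinFieldName": "b1", "lookup": "A.a1",
  "joinFieldName": "b2", "lookup": "A.a2",
  "direction": "out" }
"""

# Edges read off the query output, written as (B row, A row).
OBSERVED_EDGES = {
    ((1, 1), (1, 1)),  # #41:0
    ((1, 1), (2, 1)),  # #42:0
    ((2, 3), (2, 3)),  # #43:0
    ((1, 2), (1, 2)),  # #44:0
    ((2, 1), (1, 1)),  # #45:0
    ((2, 1), (2, 1)),  # #46:0
}


def parse_edge_config(text):
    """Parse the config with Python's json module.

    For repeated names, json.loads keeps only the last name/value pair.
    Whether OrientDB's parser does the same is an assumption.
    """
    return json.loads(text)


def edges_on_single_field(a_rows, b_rows, index):
    """Pair every B row with every A row that agrees on one field only."""
    return {(b, a) for b in b_rows for a in a_rows if b[index] == a[index]}


def edges_on_both_fields(a_rows, b_rows):
    """Pair B with A only when both fields agree (the expected AND match)."""
    return {(b, a) for b in b_rows for a in a_rows if b == a}


if __name__ == "__main__":
    cfg = parse_edge_config(EDGE_CONFIG)
    print("surviving join:", cfg["joinFieldName"], "->", cfg["lookup"])

    second_only = edges_on_single_field(A_ROWS, B_ROWS, 1)
    print("second field only == observed:", second_only == OBSERVED_EDGES)
    print("AND match edge count:", len(edges_on_both_fields(A_ROWS, B_ROWS)))
```

If the ETL parser does collapse duplicate keys in the same way, that would be consistent with the first joinFieldName appearing to be ignored, but the sketch cannot confirm it.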
