Thanks Luca,

I just had a look at the documentation for ETL.
If I understand correctly, it's not possible to start from the 3 separate 
tables like above (term, doc, term_doc), as the joinFieldName needs to 
refer to a field of one of the Vertex tables (either term or doc in this 
case). 
Is that correct? Or can this still be done?
So I'd have to join the tables before using the ETL module?


Best, Roberto

On Wednesday, 1 October 2014 18:47:28 UTC+2, Lvc@ wrote:
>
> Hi Roberto,
> I suggest you to use the ETL module: much more powerful and fast:
>
>
> http://www.orientechnologies.com/docs/last/orientdb-etl.wiki/Import-from-DBMS.html
>
> Lvc@
>
> ᐧ
>
> On 1 October 2014 17:56, Roberto Cornacchia <[email protected] 
> <javascript:>> wrote:
>
>> Hi there,
>>
>> I'm trying to approach OrientDB, coming from relational background.
>>
>> I've already looked at this: 
>> http://www.orientechnologies.com/docs/last/orientdb.wiki/Import-RDBMS-to-Graph-Model.html
>>
>> However, I'm not sure how I would do this when I start from a 
>> many-to-many relationship.
>>
>> *Example (term-doc matrix as used in information retrieval - which term 
>> occurs in which document)*
>>
>> -- all terms
>> CREATE TABLE term (id INTEGER, term STRING);
>> INSERT INTO term VALUES (0, 'OrientDB');
>> INSERT INTO term VALUES (1, 'is');
>> INSERT INTO term VALUES (2, 'cool');
>>
>> -- all docs
>> CREATE TABLE doc (id INTEGER, title STRING);
>> INSERT INTO doc VALUES (10, 'manual');
>> INSERT INTO doc VALUES (11, 'license');
>>
>> -- many-to-many relations
>> CREATE TABLE term_doc (term_id INTEGER, doc_id INTEGER);
>> INSERT INTO term_doc VALUES (0, 10);
>> INSERT INTO term_doc VALUES (0, 11);
>> INSERT INTO term_doc VALUES (1, 10);
>> INSERT INTO term_doc VALUES (2, 10);
>>
>> *First question:*
>> With OrientDB, turning the term and the doc records into Vertex records 
>> is easy. No problem there.
>> How could I take the content of term_doc and create the respective edges? 
>> Can this be done in the OrientDB SQL console?
>>
>>
>> *Second question:*Supposing I have solved the first issue, and I want to 
>> count how many times each term appears in each document.
>> In relational, this would be:
>> SELECT term_id, doc_id, count(*)
>> FROM   term_doc
>> GROUP  BY term,doc;
>>
>> Now, I suppose this is not how you would do it once you have modeled the 
>> term-doc problem into a graph database. Also because when I tried to do run 
>> this query against a simple import of term_doc, it was incredibly slow.
>> So, supposing I have real edges between term and doc, instead of the 
>> explicit table term_doc, how could I obtain the count I mention above?
>>
>> Thanks!
>> Roberto
>>
>> -- 
>>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to