Hi Roberto,
Could you provide a simple example of what you need in terms of source
tables and expected result in OrientDB?

Lvc@

On 2 October 2014 09:56, Roberto Cornacchia <[email protected]>
wrote:

> Thanks Luca,
>
> I just had a look at the documentation for ETL.
> If I understand correctly, it's not possible to start from the 3 separate
> tables like above (term, doc, term_doc), as the joinFieldName needs to
> refer to a field of one of the Vertex tables (either term or doc in this
> case).
> Is that correct? Or can this still be done?
> So I'd have to join the tables before using the ETL module?
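If a pre-join did turn out to be necessary (that is still the open question here, not something confirmed in this thread), it could be sketched as below. This uses Python's built-in sqlite3 module purely as a stand-in for the source DBMS; the function name joined_rows is hypothetical, and TEXT replaces the STRING type of the original example for portability.

```python
import sqlite3


def joined_rows():
    """Flatten term, doc and term_doc into one row per term-document pair."""
    conn = sqlite3.connect(":memory:")
    # The three example tables quoted in this thread.
    conn.executescript(
        """
        CREATE TABLE term (id INTEGER, term TEXT);
        INSERT INTO term VALUES (0, 'OrientDB'), (1, 'is'), (2, 'cool');
        CREATE TABLE doc (id INTEGER, title TEXT);
        INSERT INTO doc VALUES (10, 'manual'), (11, 'license');
        CREATE TABLE term_doc (term_id INTEGER, doc_id INTEGER);
        INSERT INTO term_doc VALUES (0, 10), (0, 11), (1, 10), (2, 10);
        """
    )
    # Resolve both foreign keys so each output row carries the term and the doc.
    query = """
        SELECT t.id, t.term, d.id, d.title
        FROM term_doc AS td
        JOIN term AS t ON t.id = td.term_id
        JOIN doc AS d ON d.id = td.doc_id
        ORDER BY t.id, d.id
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in joined_rows():
        print(row)
```

Each output row then holds both endpoints of one relation, which is the shape a single-table import would need.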
>
>
> Best, Roberto
>
> On Wednesday, 1 October 2014 18:47:28 UTC+2, Lvc@ wrote:
>>
>> Hi Roberto,
>> I suggest you use the ETL module, which is much more powerful and faster:
>>
>> http://www.orientechnologies.com/docs/last/orientdb-etl.wiki/Import-from-DBMS.html
>>
>> Lvc@
>>
>> On 1 October 2014 17:56, Roberto Cornacchia <[email protected]>
>> wrote:
>>
>>> Hi there,
>>>
>>> I'm trying to approach OrientDB, coming from a relational background.
>>>
>>> I've already looked at this:
>>> http://www.orientechnologies.com/docs/last/orientdb.wiki/Import-RDBMS-to-Graph-Model.html
>>>
>>> However, I'm not sure how I would do this when I start from a
>>> many-to-many relationship.
>>>
>>> *Example (term-doc matrix as used in information retrieval - which term
>>> occurs in which document)*
>>>
>>> -- all terms
>>> CREATE TABLE term (id INTEGER, term STRING);
>>> INSERT INTO term VALUES (0, 'OrientDB');
>>> INSERT INTO term VALUES (1, 'is');
>>> INSERT INTO term VALUES (2, 'cool');
>>>
>>> -- all docs
>>> CREATE TABLE doc (id INTEGER, title STRING);
>>> INSERT INTO doc VALUES (10, 'manual');
>>> INSERT INTO doc VALUES (11, 'license');
>>>
>>> -- many-to-many relations
>>> CREATE TABLE term_doc (term_id INTEGER, doc_id INTEGER);
>>> INSERT INTO term_doc VALUES (0, 10);
>>> INSERT INTO term_doc VALUES (0, 11);
>>> INSERT INTO term_doc VALUES (1, 10);
>>> INSERT INTO term_doc VALUES (2, 10);
>>>
>>> *First question:*
>>> With OrientDB, turning the term and the doc records into Vertex records
>>> is easy. No problem there.
>>> How could I take the content of term_doc and create the respective
>>> edges? Can this be done in the OrientDB SQL console?
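One possible way to approach the first question is to generate one CREATE EDGE statement per term_doc row and run the result as a script in the console. The sketch below only builds the statement text. It assumes the vertices were imported into classes named Term and Doc that keep the original id field, and that an edge class named OccursIn exists; all three names are hypothetical, and the CREATE EDGE ... FROM (query) TO (query) form should be checked against the documentation for the OrientDB version in use.

```python
# Rows of the term_doc table from the example: (term_id, doc_id).
TERM_DOC = [(0, 10), (0, 11), (1, 10), (2, 10)]

# Class names are assumptions for illustration, not taken from the thread.
TEMPLATE = (
    "CREATE EDGE {edge} "
    "FROM (SELECT FROM {term} WHERE id = {term_id}) "
    "TO (SELECT FROM {doc} WHERE id = {doc_id})"
)


def edge_statements(pairs, edge="OccursIn", term="Term", doc="Doc"):
    """Return one CREATE EDGE statement per (term_id, doc_id) pair."""
    return [
        TEMPLATE.format(
            edge=edge,
            term=term,
            doc=doc,
            term_id=int(term_id),  # int() keeps non-numeric input out of the SQL text
            doc_id=int(doc_id),
        )
        for term_id, doc_id in pairs
    ]


if __name__ == "__main__":
    for statement in edge_statements(TERM_DOC):
        print(statement + ";")
```

Looking up each endpoint by id in a subquery gets slow on large inputs unless the id field is indexed in both vertex classes.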
>>>
>>>
>>> *Second question:*
>>> Suppose I have solved the first issue and want to count how many
>>> times each term appears in each document.
>>> In relational terms, this would be:
>>> SELECT term_id, doc_id, count(*)
>>> FROM   term_doc
>>> GROUP  BY term_id, doc_id;
>>>
>>> Now, I suppose this is not how you would do it once the term-doc problem
>>> is modeled in a graph database, not least because when I tried to run
>>> this query against a simple import of term_doc, it was incredibly slow.
>>> So, supposing I have real edges between term and doc, instead of the
>>> explicit table term_doc, how could I obtain the count I mention above?
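As a relational baseline to compare any graph-side answer against, the count can be checked on the sample data. This is only a sketch using Python's built-in sqlite3 module as a stand-in DBMS; the function name count_occurrences is hypothetical, and the query groups by the term_id and doc_id columns that term_doc actually has.

```python
import sqlite3

# Rows of the term_doc table from the example: (term_id, doc_id).
TERM_DOC = [(0, 10), (0, 11), (1, 10), (2, 10)]


def count_occurrences(pairs):
    """Return (term_id, doc_id, count) for every distinct pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE term_doc (term_id INTEGER, doc_id INTEGER)")
    conn.executemany("INSERT INTO term_doc VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT term_id, doc_id, count(*)
        FROM term_doc
        GROUP BY term_id, doc_id
        ORDER BY term_id, doc_id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in count_occurrences(TERM_DOC):
        print(row)
```

With real edges, the same number would be the count of edges between a given term vertex and a given doc vertex, so a term occurring several times in one document needs either several edges or a count stored on a single edge.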
>>>
>>> Thanks!
>>> Roberto
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
