Re: [orientdb] Embedded List Query Performance In OrientDB

Wise Jack Sat, 19 Apr 2014 22:07:22 -0700

Has anyone got a suggestion?
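
To make concrete what I expected the index to do, here is a minimal sketch in plain Python. It is not OrientDB code and says nothing about how OrientDB implements its indexes; the function names and the two sample records are hypothetical, modelled on the fieldA example quoted below. A scan reads every record, which is what documentReads matching the record count in the explain output looks like, while an index keyed on each list element would only touch the matching records.

```python
from collections import defaultdict

# Hypothetical sample data, shaped like the fieldA example in this thread.
records = {
    "#11:0": {"fieldA": ["a", "b", "c"]},
    "#11:1": {"fieldA": ["c", "d", "e"]},
}


def full_scan(records, value):
    """Read every record and test list membership.

    Returns (matching record ids, number of records read).
    """
    reads = 0
    hits = []
    for rid, doc in records.items():
        reads += 1
        if value in doc["fieldA"]:
            hits.append(rid)
    return hits, reads


def build_inverted_index(records):
    """Map each list element to the set of record ids whose list contains it."""
    index = defaultdict(set)
    for rid, doc in records.items():
        for item in doc["fieldA"]:
            index[item].add(rid)
    return index


def index_lookup(index, value):
    """Return the matching record ids, sorted, without reading other records."""
    return sorted(index.get(value, ()))


index = build_inverted_index(records)
scan_hits, scan_reads = full_scan(records, "d")
lookup_hits = index_lookup(index, "d")
```

Both paths return the same record for 'd', but the scan has to read every record to find it, whereas the lookup is a single key access. That per-element lookup is the behaviour I was hoping for on an indexed embedded list.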

On Tuesday, April 15, 2014 10:00:19 AM UTC+8, Wise Jack wrote:
>
> Hi, Andrey:
>
> The result is the same: unacceptably slow. Here is the explain output:
>
> orientdb {compounds}> explain select * from Compound where '000075-72-9' in cas
>
> Profiled command '{involvedIndexes:[1],current:#11:960477,fetchingFromTargetElapsed:327390,documentReads:959211,documentAnalyzedCompatibleClass:959211,recordReads:959211,elapsed:327501.62,resultType:collection,resultSize:1}' in 327.528992 sec(s):
>
> {"@type":"d","@version":0,"involvedIndexes":["Compound.cas"],"current":"#11:960477","fetchingFromTargetElapsed":327390,"documentReads":959211,"documentAnalyzedCompatibleClass":959211,"recordReads":959211,"elapsed":327501.62,"resultType":"collection","resultSize":1,"@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
>
> On Friday, April 11, 2014 6:15:39 PM UTC+8, Andrey Lomakin wrote:
>>
>> Hi,
>> Could you try now?
>>
>>
>> On Tue, Apr 8, 2014 at 5:13 PM, Wise Jack wrote:
>>
>>> Hi, Andrey:
>>>
>>> Sure. I can't send you the whole database because it's too large, so 
>>> here is a sample record instead.
>>>
>>> I'm migrating a chemical compounds database from MySQL to OrientDB. This 
>>> is a sample record:
>>> --------------------------------------------------
>>> ODocument - Class: Compound   id: #11:5111   v.1
>>> --------------------------------------------------
>>>       iupac_cas_name : chloro(trifluoro)methane
>>>          create_date : Sat Jan 17 00:00:00 CST 1970
>>> iupac_traditional_name : chloro(trifluoro)methane
>>> cactvs_hbond_acceptor : 3
>>>      component_count : 1
>>>   cactvs_tauto_count : 1
>>>      nonstandardbond : null
>>>     molecular_weight : 104.45891
>>>      coordinate_type : 1
>>> 5
>>> 255
>>>  monoisotopic_weight : 103.964066
>>>       iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
>>>           exact_mass : 103.964066
>>>               xlogp3 : 2.0
>>>           iupac_name : chloro(trifluoro)methane
>>>   openeye_iso_smiles : C(F)(F)(F)Cl
>>> compound_canonicalized : 1
>>>  isotopic_atom_count : 0
>>>      cactvs_subskeys : 
>>> AAADcQAAAYAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
>>> atom_udef_stereo_count : 0
>>>    cactvs_complexity : 28
>>> iupac_systematic_name : chloranyl-tris(fluoranyl)methane
>>> bond_udef_stereo_count : 0
>>> bond_def_stereo_count : 0
>>>   cactvs_hbond_donor : 0
>>>      bondannotations : undefined
>>>          cactvs_tpsa : 0
>>>                  cas : [75-72-9, 185009-43-2
>>> 75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 (Beilstein 
>>> Handbook Reference)]
>>>   openeye_can_smiles : C(F)(F)(F)Cl
>>>     heavy_atom_count : 5
>>>   iupac_openeye_name : chloro(trifluoro)methane
>>>          iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
>>>          modify_date : Sat Jan 17 00:00:00 CST 1970
>>>    molecular_formula : CClF3
>>>         total_charge : 0
>>>         compound_cid : 6392
>>> atom_def_stereo_count : 0
>>> cactvs_rotatable_bond : 0
>>>
>>> The embedded list field is the cas field.
>>>
>>> The schema of the Compound class is in the attachment.
>>>
>>> On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>>>
>>>> Could you provide a database sample?
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack wrote:
>>>>
>>>>> Hi, Andrey.
>>>>>
>>>>> Thanks for your reply. The memory information is below:
>>>>>
>>>>> [root@root ~]# cat /proc/meminfo
>>>>> MemTotal:        8063160 kB
>>>>> MemFree:          228968 kB
>>>>>
>>>>> As you can see from the explain output:
>>>>>
>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>      "current":"#11:960477",
>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>      "documentReads":959211,
>>>>>
>>>>> Even though the database can see the index, it still iterates over all 
>>>>> the documents in the database. I think that is why the query is slow.
>>>>>
>>>>> The same data in MySQL (using an index on fieldA) returns in 0.015 
>>>>> seconds, so I don't think the data is at fault. Maybe there is a better 
>>>>> way to create the index, or to query through it, for an embedded list 
>>>>> in OrientDB.
>>>>>
>>>>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>>>>
>>>>>> Yes, too slow.
>>>>>> How much RAM do you have?
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack wrote:
>>>>>>
>>>>>>> I'm testing OrientDB as the storage database for a knowledge base.
>>>>>>>
>>>>>>> The database can be something like this:
>>>>>>>
>>>>>>> [
>>>>>>>     {
>>>>>>>         fieldA: ['a','b','c']
>>>>>>>     },
>>>>>>>     {
>>>>>>>         fieldA: ['c','d','e']
>>>>>>>     },
>>>>>>> ]
>>>>>>>
>>>>>>>
>>>>>>> and the query is something like this:
>>>>>>>
>>>>>>> select from ClassA where 'c' in fieldA
>>>>>>>
>>>>>>>
>>>>>>> The query is very slow. The explain output for it is below:
>>>>>>>
>>>>>>> {
>>>>>>>     "@type":"d","@version":0,
>>>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>>>      "current":"#11:960477",
>>>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>>>      "documentReads":959211,
>>>>>>>      "documentAnalyzedCompatibleClass":959211,
>>>>>>>      "recordReads":959211,
>>>>>>>      "elapsed":160596.25,
>>>>>>>      "resultType":"collection",
>>>>>>>      "resultSize":1,
>>>>>>>      "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>>>>  }
>>>>>>>
>>>>>>> As you can see, even though OrientDB reports the fieldA index as 
>>>>>>> involved, the query still takes about 160 seconds (elapsed is 160596.25 
>>>>>>> ms) over roughly a million records, which is unacceptable.
>>>>>>>
>>>>>>> Is there any good way to make this query faster?
>>>>>>>
>>>>>>> https://stackoverflow.com/questions/22896528/embedded-list-query-performance-in-orientdb
>>>>>>>  
>>>>>>> -- 
>>>>>>>
>>>>>>> --- 
>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>> Groups "OrientDB" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>> send an email to [email protected].
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Best regards,
>>>>>> Andrey Lomakin.
>>>>>>
>>>>>> Orient Technologies
>>>>>> the Company behind OrientDB
>>>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>>

