Hi Jack,

If you execute this:

select count(*) from Compound where '000075-72-9' in cas

How many records are retrieved? How much time does it take?
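
For reference, the figures from the explain output quoted below can be summarized with a small Python sketch. The `summarize` helper is hypothetical, and the millisecond unit is an assumption, inferred from the console reporting the same run as 327.528992 sec. It only restates the numbers already in this thread: how long the query ran and how many records were read for each record returned.

```python
import json

# Values copied from the explain output quoted later in this thread.
# Assumption: "elapsed" is in milliseconds (the console reports the same
# run as 327.528992 seconds).
explain = json.loads("""
{"involvedIndexes": ["Compound.cas"],
 "fetchingFromTargetElapsed": 327390,
 "documentReads": 959211,
 "recordReads": 959211,
 "elapsed": 327501.62,
 "resultSize": 1}
""")


def summarize(profile):
    """Return (elapsed seconds, records read per result, records read per second)."""
    seconds = profile["elapsed"] / 1000.0
    # Guard against division by zero when a query returns nothing.
    reads_per_result = profile["recordReads"] / max(profile["resultSize"], 1)
    reads_per_second = profile["recordReads"] / seconds
    return seconds, reads_per_result, reads_per_second


seconds, reads_per_result, reads_per_second = summarize(explain)
print(f"elapsed: {seconds:.1f} s")
print(f"records read per returned record: {reads_per_result:.0f}")
print(f"read throughput: {reads_per_second:.0f} records/s")
```

If an index lookup were actually serving the query, the records-read-per-result figure would be expected to sit close to 1 rather than close to the size of the class.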

Lvc@



On 20 April 2014 07:07, Wise Jack <[email protected]> wrote:

> Has anyone got a suggestion?
>
>
> On Tuesday, April 15, 2014 10:00:19 AM UTC+8, Wise Jack wrote:
>>
>> Hi, Andrey:
>>
>> The result is the same: unacceptably slow. Here is the explain output:
>>
>> orientdb {compounds}> explain select * from Compound where '000075-72-9' in cas
>>
>> Profiled command '{involvedIndexes:[1],current:#11:960477,
>> fetchingFromTargetElapsed:327390,documentReads:959211,
>> documentAnalyzedCompatibleClass:959211,recordReads:959211,
>> elapsed:327501.62,resultType:collection,resultSize:1}' in 327.528992 sec(s):
>> {"@type":"d","@version":0,"involvedIndexes":["Compound.cas"],
>> "current":"#11:960477","fetchingFromTargetElapsed":327390,
>> "documentReads":959211,"documentAnalyzedCompatibleClass":959211,
>> "recordReads":959211,"elapsed":327501.62,"resultType":"collection",
>> "resultSize":1,
>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
>>
>> On Friday, April 11, 2014 6:15:39 PM UTC+8, Andrey Lomakin wrote:
>>>
>>> Hi,
>>> Could you try now?
>>>
>>>
>>> On Tue, Apr 8, 2014 at 5:13 PM, Wise Jack <[email protected]> wrote:
>>>
>>>> Hi, Andrey:
>>>>
>>>> Sure. I'll send you a sample document from the database; I can't send the
>>>> whole database since it's too large.
>>>>
>>>> This is a sample record. I'm migrating a chemical compounds database
>>>> from MySQL to OrientDB.
>>>> --------------------------------------------------
>>>> ODocument - Class: Compound   id: #11:5111   v.1
>>>> --------------------------------------------------
>>>>       iupac_cas_name : chloro(trifluoro)methane
>>>>          create_date : Sat Jan 17 00:00:00 CST 1970
>>>> iupac_traditional_name : chloro(trifluoro)methane
>>>> cactvs_hbond_acceptor : 3
>>>>      component_count : 1
>>>>   cactvs_tauto_count : 1
>>>>      nonstandardbond : null
>>>>     molecular_weight : 104.45891
>>>>      coordinate_type : 1
>>>> 5
>>>> 255
>>>>  monoisotopic_weight : 103.964066
>>>>       iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
>>>>           exact_mass : 103.964066
>>>>               xlogp3 : 2.0
>>>>           iupac_name : chloro(trifluoro)methane
>>>>   openeye_iso_smiles : C(F)(F)(F)Cl
>>>> compound_canonicalized : 1
>>>>  isotopic_atom_count : 0
>>>>      cactvs_subskeys : AAADcQAAAYAEAAAAAAAAAAAAAAAAAA
>>>> AAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAA
>>>> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
>>>> atom_udef_stereo_count : 0
>>>>    cactvs_complexity : 28
>>>> iupac_systematic_name : chloranyl-tris(fluoranyl)methane
>>>> bond_udef_stereo_count : 0
>>>> bond_def_stereo_count : 0
>>>>   cactvs_hbond_donor : 0
>>>>      bondannotations : undefined
>>>>          cactvs_tpsa : 0
>>>>                  cas : [75-72-9, 185009-43-2
>>>> 75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 (Beilstein
>>>> Handbook Reference)]
>>>>   openeye_can_smiles : C(F)(F)(F)Cl
>>>>     heavy_atom_count : 5
>>>>   iupac_openeye_name : chloro(trifluoro)methane
>>>>          iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
>>>>          modify_date : Sat Jan 17 00:00:00 CST 1970
>>>>    molecular_formula : CClF3
>>>>         total_charge : 0
>>>>         compound_cid : 6392
>>>> atom_def_stereo_count : 0
>>>> cactvs_rotatable_bond : 0
>>>>
>>>> The embedded list field is the CAS field.
>>>>
>>>> The schema of class Compound is in the attachment.
>>>>
>>>> On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>>>>
>>>>> Could you provide a database sample?
>>>>>
>>>>>
>>>>> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack <[email protected]> wrote:
>>>>>
>>>>>> Hi, Andrey.
>>>>>>
>>>>>> Thanks for your reply. The memory information is as below:
>>>>>>
>>>>>> [root@root ~]# cat /proc/meminfo
>>>>>> MemTotal:        8063160 kB
>>>>>> MemFree:          228968 kB
>>>>>>
>>>>>> As you can see
>>>>>>
>>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>>      "current":"#11:960477",
>>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>>      "documentReads":959211,
>>>>>>
>>>>>> Even though the database sees the index, it still iterates over all the
>>>>>> documents in the database; I think that's the reason for the slowness.
>>>>>>
>>>>>> The same data in MySQL (using fieldA's index) returns in 0.015 seconds,
>>>>>> so I don't think the data is at fault. Maybe there is a better way to
>>>>>> create the index, or to query through it, for an embedded list in
>>>>>> OrientDB.
>>>>>>
>>>>>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>>>>>
>>>>>>> Yes, too slow.
>>>>>>> How much RAM do you have?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack <[email protected]> wrote:
>>>>>>>
>>>>>>>> I'm testing OrientDB as the storage database for a knowledge base.
>>>>>>>>
>>>>>>>> The database can be something like this:
>>>>>>>>
>>>>>>>> [
>>>>>>>>     {
>>>>>>>>         fieldA: ['a','b','c']
>>>>>>>>     },
>>>>>>>>     {
>>>>>>>>         fieldA: ['c','d','e']
>>>>>>>>     },
>>>>>>>> ]
>>>>>>>>
>>>>>>>>
>>>>>>>> and the query is something like this:
>>>>>>>>
>>>>>>>> select from ClassA where 'c' in fieldA
>>>>>>>>
>>>>>>>>
>>>>>>>> The query is very slow. The explain output for the query is below:
>>>>>>>>
>>>>>>>> {
>>>>>>>>     "@type":"d","@version":0,
>>>>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>>>>      "current":"#11:960477",
>>>>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>>>>      "documentReads":959211,
>>>>>>>>      "documentAnalyzedCompatibleClass":959211,
>>>>>>>>      "recordReads":959211,
>>>>>>>>      "elapsed":160596.25,
>>>>>>>>      "resultType":"collection",
>>>>>>>>      "resultSize":1,
>>>>>>>>      
>>>>>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>>>>>  }
>>>>>>>>
>>>>>>>> As you can see, even though OrientDB used the fieldA index, it still
>>>>>>>> took about 160 seconds (elapsed: 160596.25 ms) to query a million
>>>>>>>> records, which is unacceptable.
>>>>>>>>
>>>>>>>> Is there any good way to make this query faster?
>>>>>>>>
>>>>>>>> https://stackoverflow.com/questions/22896528/embedded-list-query-performance-in-orientdb
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrey Lomakin.
>>>>>>>
>>>>>>> Orient Technologies
>>>>>>> the Company behind OrientDB
>>>>>>>
