Hi, Andrey:
The result is the same: unacceptably slow. Here is the explain output:
orientdb {compounds}> explain select * from Compound where '000075-72-9' in cas
Profiled command
'{involvedIndexes:[1],current:#11:960477,fetchingFromTargetElapsed:327390,documentReads:959211,documentAnalyzedCompatibleClass:959211,recordReads:959211,elapsed:327501.62,resultType:collection,resultSize:1}'
in 327.528992 sec(s):
{"@type":"d","@version":0,"involvedIndexes":["Compound.cas"],"current":"#11:960477","fetchingFromTargetElapsed":327390,"documentReads":959211,"documentAnalyzedCompatibleClass":959211,"recordReads":959211,"elapsed":327501.62,"resultType":"collection","resultSize":1,"@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
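To make the numbers above easier to read, here is a minimal Python sketch that summarizes the profile. It assumes `elapsed` is in milliseconds (which matches the 327.5 sec reported by the console), and `looks_like_full_scan` with its threshold is a hypothetical heuristic of mine, not an OrientDB API:

```python
import json

# Profile copied from the explain output above, trimmed to the fields used here.
profile_json = (
    '{"involvedIndexes":["Compound.cas"],"documentReads":959211,'
    '"recordReads":959211,"elapsed":327501.62,"resultSize":1}'
)


def summarize(profile):
    """Return (elapsed seconds, documents read per returned result)."""
    # Assumption: "elapsed" is reported in milliseconds.
    elapsed_s = profile["elapsed"] / 1000.0
    # Guard against a zero result size to avoid division by zero.
    reads_per_result = profile["documentReads"] / max(profile["resultSize"], 1)
    return elapsed_s, reads_per_result


def looks_like_full_scan(profile, ratio_threshold=1000):
    """Hypothetical heuristic: many documents read for very few results."""
    _, reads_per_result = summarize(profile)
    return reads_per_result >= ratio_threshold


profile = json.loads(profile_json)
elapsed_s, reads_per_result = summarize(profile)
print("indexes reported:", profile["involvedIndexes"])
print("elapsed seconds:", round(elapsed_s, 1))
print("documents read per result:", reads_per_result)
print("looks like a full scan:", looks_like_full_scan(profile))
```

An index is listed in `involvedIndexes`, yet 959211 documents were read to return a single result, which is what a scan of the whole class would look like.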
On Friday, April 11, 2014 6:15:39 PM UTC+8, Andrey Lomakin wrote:
>
> Hi,
> Could you try now?
>
>
> On Tue, Apr 8, 2014 at 5:13 PM, Wise Jack <[email protected]> wrote:
>
>> Hi, Andrey:
>>
>> Sure. I'll send you a sample document from the database; I can't send the
>> whole database because it's too large.
>>
>> This is a sample record. I'm migrating a chemical compounds database from
>> MySQL to OrientDB.
>> --------------------------------------------------
>> ODocument - Class: Compound id: #11:5111 v.1
>> --------------------------------------------------
>> iupac_cas_name : chloro(trifluoro)methane
>> create_date : Sat Jan 17 00:00:00 CST 1970
>> iupac_traditional_name : chloro(trifluoro)methane
>> cactvs_hbond_acceptor : 3
>> component_count : 1
>> cactvs_tauto_count : 1
>> nonstandardbond : null
>> molecular_weight : 104.45891
>> coordinate_type : 1
>> 5
>> 255
>> monoisotopic_weight : 103.964066
>> iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
>> exact_mass : 103.964066
>> xlogp3 : 2.0
>> iupac_name : chloro(trifluoro)methane
>> openeye_iso_smiles : C(F)(F)(F)Cl
>> compound_canonicalized : 1
>> isotopic_atom_count : 0
>> cactvs_subskeys :
>> AAADcQAAAYAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
>> atom_udef_stereo_count : 0
>> cactvs_complexity : 28
>> iupac_systematic_name : chloranyl-tris(fluoranyl)methane
>> bond_udef_stereo_count : 0
>> bond_def_stereo_count : 0
>> cactvs_hbond_donor : 0
>> bondannotations : undefined
>> cactvs_tpsa : 0
>> cas : [75-72-9, 185009-43-2
>> 75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 (Beilstein
>> Handbook Reference)]
>> openeye_can_smiles : C(F)(F)(F)Cl
>> heavy_atom_count : 5
>> iupac_openeye_name : chloro(trifluoro)methane
>> iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
>> modify_date : Sat Jan 17 00:00:00 CST 1970
>> molecular_formula : CClF3
>> total_charge : 0
>> compound_cid : 6392
>> atom_def_stereo_count : 0
>> cactvs_rotatable_bond : 0
>>
>> The embedded list field is the CAS field.
>>
>> The schema of the Compound class is in the attachment.
>>
>> On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>>
>>> Could you provide a database sample?
>>>
>>>
>>> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack <[email protected]> wrote:
>>>
>>>> Hi, Andrey.
>>>>
>>>> Thanks for your reply. The memory information is as below:
>>>>
>>>> [root@root ~]# cat /proc/meminfo
>>>> MemTotal: 8063160 kB
>>>> MemFree: 228968 kB
>>>>
>>>> As you can see from this part of the explain output:
>>>>
>>>> "involvedIndexes":["ClassA.fieldA"],
>>>> "current":"#11:960477",
>>>> "fetchingFromTargetElapsed":160596,
>>>> "documentReads":959211,
>>>>
>>>> The database sees the index, but it still iterates over all the
>>>> documents in the database. I think that is why the query is slow.
>>>>
>>>> With the same data, MySQL (using an index on fieldA) returns the result
>>>> in 0.015 seconds, so I don't think the data is at fault. Maybe there is a
>>>> better way to create the index, or to query through it, for an embedded
>>>> list in OrientDB.
>>>>
>>>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>>>
>>>>> Yes, too slow.
>>>>> How much RAM do you have?
>>>>>
>>>>>
>>>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack <[email protected]> wrote:
>>>>>
>>>>>> I'm testing OrientDB as the storage database for a knowledge base.
>>>>>>
>>>>>> The data looks something like this:
>>>>>>
>>>>>> [
>>>>>>   {
>>>>>>     fieldA: ['a','b','c']
>>>>>>   },
>>>>>>   {
>>>>>>     fieldA: ['c','d','e']
>>>>>>   }
>>>>>> ]
>>>>>>
>>>>>>
>>>>>> and the query is something like this:
>>>>>>
>>>>>> select from ClassA where 'c' in fieldA
>>>>>>
>>>>>>
>>>>>> The query is very slow. The explain output for the query is below:
>>>>>>
>>>>>> {
>>>>>> "@type":"d","@version":0,
>>>>>> "involvedIndexes":["ClassA.fieldA"],
>>>>>> "current":"#11:960477",
>>>>>> "fetchingFromTargetElapsed":160596,
>>>>>> "documentReads":959211,
>>>>>> "documentAnalyzedCompatibleClass":959211,
>>>>>> "recordReads":959211,
>>>>>> "elapsed":160596.25,
>>>>>> "resultType":"collection",
>>>>>> "resultSize":1,
>>>>>>
>>>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>>> }
>>>>>>
>>>>>> As you can see, even though OrientDB reports the fieldA index as
>>>>>> involved, the query still takes about 160 seconds (elapsed is
>>>>>> 160596 ms) over roughly a million records, which is unacceptable.
>>>>>>
>>>>>> Is there any good way to make this query faster?
>>>>>>
>>>>>> https://stackoverflow.com/questions/22896528/embedded-list-query-performance-in-orientdb
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>>
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Andrey Lomakin.
>>>>>
>>>>> Orient Technologies
>>>>> the Company behind OrientDB
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>
>