Has anyone got a suggestion?
On Tuesday, April 15, 2014 10:00:19 AM UTC+8, Wise Jack wrote:
>
> Hi, Andrey:
>
> The result is the same, unacceptable slow, here is the explain:
>
> orientdb {compounds}> explain select * from Compound where '000075-72-9'
> in cas
>
> Profiled command
> '{involvedIndexes:[1],current:#11:960477,fetchingFromTargetElapsed:327390,documentReads:959211,documentAnalyzedCompatibleClass:959211,recordReads:959211,elapsed:327501.62,resultType:collection,resultSize:1}'
>
> in 327.528992 sec(s):
>
> {"@type":"d","@version":0,"involvedIndexes":["Compound.cas"],"current":"#11:960477","fetchingFromTargetElapsed":327390,"documentReads":959211,"documentAnalyzedCompatibleClass":959211,"recordReads":959211,"elapsed":327501.62,"resultType":"collection","resultSize":1,"@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
>
> On Friday, April 11, 2014 6:15:39 PM UTC+8, Andrey Lomakin wrote:
>>
>> HI,
>> Could you try now ?
>>
>>
>> On Tue, Apr 8, 2014 at 5:13 PM, Wise Jack <[email protected]> wrote:
>>
>>> Hi, Andrey:
>>>
>>> Sure. I'll send you a sample document of the database, I can't send the
>>> whole database to you since it's too large:
>>>
>>> This is a sample record of the database, I'm immigrating a chemical
>>> compounds database from MySQL to OrientDB.
>>> --------------------------------------------------
>>> ODocument - Class: Compound id: #11:5111 v.1
>>> --------------------------------------------------
>>> iupac_cas_name : chloro(trifluoro)methane
>>> create_date : Sat Jan 17 00:00:00 CST 1970
>>> iupac_traditional_name : chloro(trifluoro)methane
>>> cactvs_hbond_acceptor : 3
>>> component_count : 1
>>> cactvs_tauto_count : 1
>>> nonstandardbond : null
>>> molecular_weight : 104.45891
>>> coordinate_type : 1
>>> 5
>>> 255
>>> monoisotopic_weight : 103.964066
>>> iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
>>> exact_mass : 103.964066
>>> xlogp3 : 2.0
>>> iupac_name : chloro(trifluoro)methane
>>> openeye_iso_smiles : C(F)(F)(F)Cl
>>> compound_canonicalized : 1
>>> isotopic_atom_count : 0
>>> cactvs_subskeys :
>>> AAADcQAAAYAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
>>> atom_udef_stereo_count : 0
>>> cactvs_complexity : 28
>>> iupac_systematic_name : chloranyl-tris(fluoranyl)methane
>>> bond_udef_stereo_count : 0
>>> bond_def_stereo_count : 0
>>> cactvs_hbond_donor : 0
>>> bondannotations : undefined
>>> cactvs_tpsa : 0
>>> cas : [75-72-9, 185009-43-2
>>> 75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 (Beilstein
>>> Handbook Reference)]
>>> openeye_can_smiles : C(F)(F)(F)Cl
>>> heavy_atom_count : 5
>>> iupac_openeye_name : chloro(trifluoro)methane
>>> iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
>>> modify_date : Sat Jan 17 00:00:00 CST 1970
>>> molecular_formula : CClF3
>>> total_charge : 0
>>> compound_cid : 6392
>>> atom_def_stereo_count : 0
>>> cactvs_rotatable_bond : 0
>>>
>>> The embedded list field is the CAS field.
>>>
>>> The schema of Class Compound is as the attachment.
>>>
>>> On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>>>
>>>> Could you provide database sample ?
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack <[email protected]> wrote:
>>>>
>>>>> Hi, Andrey.
>>>>>
>>>>> Thanks for your reply. The memory information is as below:
>>>>>
>>>>> [root@root ~]# cat /proc/meminfo
>>>>> MemTotal: 8063160 kB
>>>>> MemFree: 228968 kB
>>>>>
>>>>> As you can see
>>>>>
>>>>> "involvedIndexes":["ClassA.fieldA"],
>>>>> "current":"#11:960477",
>>>>> "fetchingFromTargetElapsed":160596,
>>>>> "documentReads":959211,
>>>>>
>>>>> Even the database can see the index, but it still iterate all the
>>>>> documents in the database, I think that's the reason for the slow.
>>>>>
>>>>> The same data in mysql(that using fieldA's index), can return data in
>>>>> 0.015second, so I think this is not the fault of the data, maybe there is
>>>>> a
>>>>> better way for creating index or querying using index for embedded list
>>>>> of
>>>>> OrientDB.
>>>>>
>>>>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>>>>
>>>>>> Yes too slow.
>>>>>> What amount of RAM do you have ?-
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack <[email protected]> wrote:
>>>>>>
>>>>>>> I'm testing orientdb for a storage database of a knowledge base.
>>>>>>>
>>>>>>> The database can be something like this:
>>>>>>>
>>>>>>> [
>>>>>>> {
>>>>>>> fieldA: ['a','b','c']
>>>>>>> },
>>>>>>> {
>>>>>>> fieldA: ['c','d','e']
>>>>>>> },
>>>>>>> ]
>>>>>>>
>>>>>>>
>>>>>>> and the query is something like this:
>>>>>>>
>>>>>>> select from ClassA where 'c' in fieldA
>>>>>>>
>>>>>>>
>>>>>>> The query is very very slow, the explain of the query is as below
>>>>>>>
>>>>>>> {
>>>>>>> "@type":"d","@version":0,
>>>>>>> "involvedIndexes":["ClassA.fieldA"],
>>>>>>> "current":"#11:960477",
>>>>>>> "fetchingFromTargetElapsed":160596,
>>>>>>> "documentReads":959211,
>>>>>>> "documentAnalyzedCompatibleClass":959211,
>>>>>>> "recordReads":959211,
>>>>>>> "elapsed":160596.25,
>>>>>>> "resultType":"collection",
>>>>>>> "resultSize":1,
>>>>>>>
>>>>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>>>> }
>>>>>>>
>>>>>>> As you can see, even OrientDB used the fieldA index, it still costs
>>>>>>> 16 seconds to query a million records, it is unacceptable.
>>>>>>>
>>>>>>> Is there any good way to make this query faster?
>>>>>>>
>>>>>>> https://stackoverflow.com/questions/22896528/embedded-list-
>>>>>>> query-performance-in-orientdb
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "OrientDB" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrey Lomakin.
>>>>>>
>>>>>> Orient Technologies
>>>>>> the Company behind OrientDB
>>>>>>
>>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OrientDB" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Andrey Lomakin.
>>>>
>>>> Orient Technologies
>>>> the Company behind OrientDB
>>>>
>>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Best regards,
>> Andrey Lomakin.
>>
>> Orient Technologies
>> the Company behind OrientDB
>>
>>
--
---
You received this message because you are subscribed to the Google Groups
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.