It looks like this problem is not attracting much attention.

That's fine. I'll change the schema of the database and make CAS a more 
important field.

Once that work is done, I'll try to track down the problem with the index 
in the code.
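
A minimal sketch of that kind of schema change, assuming it means keeping one lookup entry per CAS number instead of searching inside each record's embedded list. The record ids and the second record are hypothetical, and this is plain Python for illustration, not OrientDB's API:

```python
from collections import defaultdict

# Sample data. The first record's CAS values come from the sample document
# quoted further down; the record ids and the second record are hypothetical.
records = {
    "#11:5111": {"cas": ["75-72-9", "185009-43-2", "50815-73-1", "000075-72-9"]},
    "#11:5112": {"cas": ["64-17-5"]},  # made-up record for illustration
}


def build_cas_index(records):
    """Map each CAS number to the set of record ids whose list contains it."""
    index = defaultdict(set)
    for rid, doc in records.items():
        for cas in doc.get("cas", []):
            index[cas].add(rid)
    return index


def find_by_cas(index, cas):
    """Return matching record ids with one lookup, without scanning records."""
    return sorted(index.get(cas, ()))


cas_index = build_cas_index(records)
print(find_by_cas(cas_index, "000075-72-9"))
```

The point of the sketch is only that a lookup keyed on the individual list items avoids reading every document, which is what the explain output below shows happening (959211 document reads for a single result).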

On Monday, April 21, 2014 1:12:09 AM UTC+8, Wise Jack wrote:
>
> Hi, Lvc:
>
> Here is the result:
>
> orientdb {compounds}> select count(*) from Compound where '000075-72-9' in cas
>
> ----+-----+-----
> #   |@RID |count
> ----+-----+-----
> 0   |#-2:0|1
> ----+-----+-----
>
> 1 item(s) found. Query executed in 222.285 sec(s).
>
> On Sunday, April 20, 2014 4:38:59 PM UTC+8, Lvc@ wrote:
>>
>> Hi Jack,
>>
>> if you execute this: 
>>
>> select count(*) from Compound where '000075-72-9' in cas
>>
>> How many records are retrieved? How much time does it take?
>>
>> Lvc@
>>
>>
>>
>> On 20 April 2014 07:07, Wise Jack <[email protected]> wrote:
>>
>>> Has anyone got a suggestion?
>>>
>>>
>>> On Tuesday, April 15, 2014 10:00:19 AM UTC+8, Wise Jack wrote:
>>>>
>>>> Hi, Andrey:
>>>>
>>>> The result is the same: unacceptably slow. Here is the explain output:
>>>>
>>>> orientdb {compounds}> explain select * from Compound where '000075-72-9' in cas
>>>>
>>>> Profiled command '{involvedIndexes:[1],current:#11:960477,
>>>> fetchingFromTargetElapsed:327390,documentReads:959211,
>>>> documentAnalyzedCompatibleClass:959211,recordReads:959211,
>>>> elapsed:327501.62,resultType:collection,resultSize:1}' in 327.528992 
>>>> sec(s):
>>>> {"@type":"d","@version":0,"involvedIndexes":["Compound.cas"],
>>>> "current":"#11:960477","fetchingFromTargetElapsed":327390,
>>>> "documentReads":959211,"documentAnalyzedCompatibleClass":959211,
>>>> "recordReads":959211,"elapsed":327501.62,"resultType":"collection",
>>>> "resultSize":1,
>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"}
>>>>
>>>> On Friday, April 11, 2014 6:15:39 PM UTC+8, Andrey Lomakin wrote:
>>>>>
>>>>> Hi,
>>>>> Could you try again now?
>>>>>
>>>>>
>>>>> On Tue, Apr 8, 2014 at 5:13 PM, Wise Jack <[email protected]> wrote:
>>>>>
>>>>>> Hi, Andrey:
>>>>>>
>>>>>> Sure. I'll send you a sample document from the database; I can't send 
>>>>>> the whole database since it's too large.
>>>>>>
>>>>>> This is a sample record. I'm migrating a chemical compounds database 
>>>>>> from MySQL to OrientDB.
>>>>>> --------------------------------------------------
>>>>>> ODocument - Class: Compound   id: #11:5111   v.1
>>>>>> --------------------------------------------------
>>>>>>       iupac_cas_name : chloro(trifluoro)methane
>>>>>>          create_date : Sat Jan 17 00:00:00 CST 1970
>>>>>> iupac_traditional_name : chloro(trifluoro)methane
>>>>>> cactvs_hbond_acceptor : 3
>>>>>>      component_count : 1
>>>>>>   cactvs_tauto_count : 1
>>>>>>      nonstandardbond : null
>>>>>>     molecular_weight : 104.45891
>>>>>>      coordinate_type : 1
>>>>>> 5
>>>>>> 255
>>>>>>  monoisotopic_weight : 103.964066
>>>>>>       iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
>>>>>>           exact_mass : 103.964066
>>>>>>               xlogp3 : 2.0
>>>>>>           iupac_name : chloro(trifluoro)methane
>>>>>>   openeye_iso_smiles : C(F)(F)(F)Cl
>>>>>> compound_canonicalized : 1
>>>>>>  isotopic_atom_count : 0
>>>>>>      cactvs_subskeys : AAADcQAAAYAEAAAAAAAAAAAAAAAAAA
>>>>>> AAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAA
>>>>>> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
>>>>>> atom_udef_stereo_count : 0
>>>>>>    cactvs_complexity : 28
>>>>>> iupac_systematic_name : chloranyl-tris(fluoranyl)methane
>>>>>> bond_udef_stereo_count : 0
>>>>>> bond_def_stereo_count : 0
>>>>>>   cactvs_hbond_donor : 0
>>>>>>      bondannotations : undefined
>>>>>>          cactvs_tpsa : 0
>>>>>>                  cas : [75-72-9, 185009-43-2
>>>>>> 75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 
>>>>>> (Beilstein Handbook Reference)]
>>>>>>   openeye_can_smiles : C(F)(F)(F)Cl
>>>>>>     heavy_atom_count : 5
>>>>>>   iupac_openeye_name : chloro(trifluoro)methane
>>>>>>          iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
>>>>>>          modify_date : Sat Jan 17 00:00:00 CST 1970
>>>>>>    molecular_formula : CClF3
>>>>>>         total_charge : 0
>>>>>>         compound_cid : 6392
>>>>>> atom_def_stereo_count : 0
>>>>>> cactvs_rotatable_bond : 0
>>>>>>
>>>>>> The embedded list field is the CAS field.
>>>>>>
>>>>>> The schema of the Compound class is attached.
>>>>>>
>>>>>> On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>>>>>>
>>>>>>> Could you provide a database sample?
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack <[email protected]>wrote:
>>>>>>>
>>>>>>>> Hi, Andrey.
>>>>>>>>
>>>>>>>> Thanks for your reply. The memory information is as below:
>>>>>>>>
>>>>>>>> [root@root ~]# cat /proc/meminfo
>>>>>>>> MemTotal:        8063160 kB
>>>>>>>> MemFree:          228968 kB
>>>>>>>>
>>>>>>>> As you can see 
>>>>>>>>
>>>>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>>>>      "current":"#11:960477",
>>>>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>>>>      "documentReads":959211,
>>>>>>>>
>>>>>>>> Even though the database can see the index, it still iterates over 
>>>>>>>> all the documents in the database. I think that's the reason for the 
>>>>>>>> slowness.
>>>>>>>>
>>>>>>>> The same data in MySQL (using fieldA's index) returns results in 
>>>>>>>> 0.015 seconds, so I don't think the data is at fault. Maybe there is 
>>>>>>>> a better way to create the index, or to query through it, for an 
>>>>>>>> embedded list in OrientDB.
>>>>>>>>
>>>>>>>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>>>>>>>
>>>>>>>>> Yes, too slow.
>>>>>>>>> How much RAM do you have?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack <[email protected]>wrote:
>>>>>>>>>
>>>>>>>>>> I'm testing OrientDB as the storage database for a knowledge base.
>>>>>>>>>>
>>>>>>>>>> The data looks something like this:
>>>>>>>>>>
>>>>>>>>>> [
>>>>>>>>>>     {
>>>>>>>>>>         fieldA: ['a','b','c']
>>>>>>>>>>     },
>>>>>>>>>>     {
>>>>>>>>>>         fieldA: ['c','d','e']
>>>>>>>>>>     },
>>>>>>>>>> ]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> and the query is something like this:
>>>>>>>>>>
>>>>>>>>>> select from ClassA where 'c' in fieldA
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The query is very slow. The explain output for it is below:
>>>>>>>>>>
>>>>>>>>>> {
>>>>>>>>>>     "@type":"d","@version":0,
>>>>>>>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>>>>>>>      "current":"#11:960477",
>>>>>>>>>>      "fetchingFromTargetElapsed":160596,
>>>>>>>>>>      "documentReads":959211,
>>>>>>>>>>      "documentAnalyzedCompatibleClass":959211,
>>>>>>>>>>      "recordReads":959211,
>>>>>>>>>>      "elapsed":160596.25,
>>>>>>>>>>      "resultType":"collection",
>>>>>>>>>>      "resultSize":1,
>>>>>>>>>>      
>>>>>>>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>>>>>>>  }
>>>>>>>>>>
>>>>>>>>>> As you can see, even though OrientDB used the fieldA index, it still 
>>>>>>>>>> takes about 160 seconds (elapsed: 160596.25 ms) to query a million 
>>>>>>>>>> records, which is unacceptable.
>>>>>>>>>>
>>>>>>>>>> Is there any good way to make this query faster?
>>>>>>>>>>
>>>>>>>>>> https://stackoverflow.com/questions/22896528/embedded-list-query-performance-in-orientdb
>>>>>>>>>>  
>>>>>>>>>> -- 
>>>>>>>>>>
>>>>>>>>>> --- 
>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>> Google Groups "OrientDB" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>>
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>>> Best regards,
>>>>>>>>> Andrey Lomakin.
>>>>>>>>>
>>>>>>>>> Orient Technologies
>>>>>>>>> the Company behind OrientDB
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>
>>
>>
