Hi, Andrey:

Sure. I'll send you a sample document of the database, I can't send the 
whole database to you since it's too large:

This is a sample record of the database, I'm immigrating a chemical 
compounds database from MySQL to OrientDB.
--------------------------------------------------
ODocument - Class: Compound   id: #11:5111   v.1
--------------------------------------------------
      iupac_cas_name : chloro(trifluoro)methane
         create_date : Sat Jan 17 00:00:00 CST 1970
iupac_traditional_name : chloro(trifluoro)methane
cactvs_hbond_acceptor : 3
     component_count : 1
  cactvs_tauto_count : 1
     nonstandardbond : null
    molecular_weight : 104.45891
     coordinate_type : 1
5
255
 monoisotopic_weight : 103.964066
      iupac_inchikey : AFYPFACVUDMOHA-UHFFFAOYSA-N
          exact_mass : 103.964066
              xlogp3 : 2.0
          iupac_name : chloro(trifluoro)methane
  openeye_iso_smiles : C(F)(F)(F)Cl
compound_canonicalized : 1
 isotopic_atom_count : 0
     cactvs_subskeys : 
AAADcQAAAYAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQIAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
atom_udef_stereo_count : 0
   cactvs_complexity : 28
iupac_systematic_name : chloranyl-tris(fluoranyl)methane
bond_udef_stereo_count : 0
bond_def_stereo_count : 0
  cactvs_hbond_donor : 0
     bondannotations : undefined
         cactvs_tpsa : 0
                 cas : [75-72-9, 185009-43-2
75-72-9, 50815-73-1, 000075-72-9, 185009-43-2, 4-01-00-00034 (Beilstein 
Handbook Reference)]
  openeye_can_smiles : C(F)(F)(F)Cl
    heavy_atom_count : 5
  iupac_openeye_name : chloro(trifluoro)methane
         iupac_inchi : InChI=1S/CClF3/c2-1(3,4)5
         modify_date : Sat Jan 17 00:00:00 CST 1970
   molecular_formula : CClF3
        total_charge : 0
        compound_cid : 6392
atom_def_stereo_count : 0
cactvs_rotatable_bond : 0

The embedded list field is the CAS field.

The schema of Class Compound is as the attachment.

On Tuesday, April 8, 2014 9:09:30 PM UTC+8, Andrey Lomakin wrote:
>
> Could you provide database sample ?
>
>
> On Tue, Apr 8, 2014 at 8:51 AM, Wise Jack <[email protected]<javascript:>
> > wrote:
>
>> Hi, Andrey.
>>
>> Thanks for your reply. The memory information is as below:
>>
>> [root@root ~]# cat /proc/meminfo
>> MemTotal:        8063160 kB
>> MemFree:          228968 kB
>>
>> As you can see 
>>
>>      "involvedIndexes":["ClassA.fieldA"],
>>      "current":"#11:960477",
>>      "fetchingFromTargetElapsed":160596,
>>      "documentReads":959211,
>>
>> Even the database can see the index, but it still iterate all the 
>> documents in the database, I think that's the reason for the slow.
>>
>> The same data in mysql(that using fieldA's index), can return data in 
>> 0.015second, so I think this is not the fault of the data, maybe there is a 
>> better way for creating index or querying using index for embedded list of 
>> OrientDB.
>>
>> On Monday, April 7, 2014 5:25:27 PM UTC+8, Andrey Lomakin wrote:
>>
>>> Yes too slow.
>>> What amount of RAM do you have ?-
>>>
>>>
>>> On Mon, Apr 7, 2014 at 5:33 AM, Wise Jack <[email protected]> wrote:
>>>
>>>> I'm testing orientdb for a storage database of a knowledge base.
>>>>
>>>> The database can be something like this:
>>>>
>>>> [
>>>>     {
>>>>         fieldA: ['a','b','c']
>>>>     },
>>>>     {
>>>>         fieldA: ['c','d','e']
>>>>     },
>>>> ]
>>>>
>>>>
>>>> and the query is something like this:
>>>>
>>>> select from ClassA where 'c' in fieldA
>>>>
>>>>
>>>> The query is very very slow, the explain of the query is as below
>>>>
>>>> {
>>>>     "@type":"d","@version":0,
>>>>      "involvedIndexes":["ClassA.fieldA"],
>>>>      "current":"#11:960477",
>>>>      "fetchingFromTargetElapsed":160596,
>>>>      "documentReads":959211,
>>>>      "documentAnalyzedCompatibleClass":959211,
>>>>      "recordReads":959211,
>>>>      "elapsed":160596.25,
>>>>      "resultType":"collection",
>>>>      "resultSize":1,
>>>>      
>>>> "@fieldTypes":"involvedIndexes=e,fetchingFromTargetElapsed=l,documentReads=l,documentAnalyzedCompatibleClass=l,recordReads=l,elapsed=f"
>>>>  }
>>>>
>>>> As you can see, even OrientDB used the fieldA index, it still costs 16 
>>>> seconds to query a million records, it is unacceptable.
>>>>
>>>> Is there any good way to make this query faster?
>>>>
>>>> https://stackoverflow.com/questions/22896528/embedded-
>>>> list-query-performance-in-orientdb
>>>>  
>>>> -- 
>>>>
>>>> --- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "OrientDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>>
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>> -- 
>>> Best regards,
>>> Andrey Lomakin.
>>>
>>> Orient Technologies
>>> the Company behind OrientDB
>>>
>>>   -- 
>>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Best regards,
> Andrey Lomakin.
>
> Orient Technologies
> the Company behind OrientDB
>
>  

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
OrientDB console v.1.7-rc2 (build UNKNOWN@r; 2014-03-25 15:54:54+0100) 
www.orientechnologies.com
Type 'help' to display all the commands supported.
Installing extensions for GREMLIN language v.2.5.0-SNAPSHOT

orientdb> connect remote:localhost/compounds admin admin
Connecting to database [remote:localhost/compounds] with user 'admin'...OK
orientdb {compounds}> desc Compound

Class................: Compound
Super class..........: V
Default cluster......: compound (id=11)
Supported cluster ids: [11]
Properties:
-------------------------------+-------------+-------------------------------+-----------+----------+----------+-----------+-----------+----------+
 NAME                          | TYPE        | LINKED TYPE/CLASS             | 
MANDATORY | READONLY | NOT NULL |    MIN    |    MAX    | COLLATE  |
-------------------------------+-------------+-------------------------------+-----------+----------+----------+-----------+-----------+----------+
 iupac_inchikey                | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_hbond_donor            | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 openeye_iso_smiles            | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 total_charge                  | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 bond_def_stereo_count         | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_tauto_count            | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_complexity             | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 iupac_openeye_name            | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 bond_udef_stereo_count        | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 atom_def_stereo_count         | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 atom_udef_stereo_count        | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 nonstandardbond               | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 compound_cid                  | LONG        | null                          | 
false     | false    | false    |           |           | default  |
 heavy_atom_count              | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 coordinate_type               | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_subskeys               | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 molecular_weight              | DOUBLE      | null                          | 
false     | false    | false    |           |           | default  |
 component_count               | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 modify_date                   | DATE        | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_rotatable_bond         | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 molecular_formula             | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 bondannotations               | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 exact_mass                    | DOUBLE      | null                          | 
false     | false    | false    |           |           | default  |
 isotopic_atom_count           | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_tpsa                   | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 iupac_cas_name                | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 cactvs_hbond_acceptor         | INTEGER     | null                          | 
false     | false    | false    |           |           | default  |
 monoisotopic_weight           | DOUBLE      | null                          | 
false     | false    | false    |           |           | default  |
 openeye_can_smiles            | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 iupac_traditional_name        | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 cas                           | EMBEDDEDLIST| STRING                        | 
false     | false    | false    |           |           | default  |
 compound_canonoicalized       | LONG        | null                          | 
false     | false    | false    |           |           | default  |
 iupac_name                    | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 iupac_inchi                   | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 create_date                   | DATE        | null                          | 
false     | false    | false    |           |           | default  |
 iupac_systematic_name         | STRING      | null                          | 
false     | false    | false    |           |           | default  |
 xlogp3                        | DOUBLE      | null                          | 
false     | false    | false    |           |           | default  |
-------------------------------+-------------+-------------------------------+-----------+----------+----------+-----------+-----------+----------+
Indexes (4 altogether):
-------------------------------+----------------+
 NAME                          | PROPERTIES     |
-------------------------------+----------------+
 Compound.iupac_traditional_name| iupac_traditional_name|
 Compound.iupac_name           | iupac_name     |
 Compound.cas                  | cas            |
 Compound.compound_cid         | compound_cid   |
-------------------------------+----------------+

Reply via email to