I've also run into a very similar problem. I currently have only 10 documents in 
a class "Company". Each document has a name field and many other 
fields, sub-fields, and sub-sub-fields. Some fields are arrays of 
sub-documents that hold a large amount of data, so the JSON of one document looks like
{ "name":"CompanyA", "employee":[{"name":"John", 
"department":"HR"}, {"name":"Tom", "department":"Accounting"}, 
..., {"name":"Joe", "department":"Security"}], .... }, 
where the "employee" field is a very large array (say the data 
under "employee" is ~100MB).

When I run "Select name from Company", it takes ~20 seconds to 
return the 10 company names. Without the "employee" field, it takes less than 1 second.

I'm a little surprised by the time difference. To me, "Select name from 
Company" has nothing to do with the "employee" field. Why does the presence of 
this bulk of data affect the performance of the query so much?
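
For what it's worth, here is how I understand the workaround BK describes in the quoted thread below (moving the large field into a separate, linked record), as a minimal Java sketch. It is only an in-memory model, not OrientDB API code: the class LinkedRecordSketch, its maps, and its method names are all hypothetical, and it makes no claim about what OrientDB does internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of "small record + linked large record"; not the OrientDB API. */
class LinkedRecordSketch {

    /** Small summary record: the name plus a link (key) to the bulky payload. */
    static final class CompanySummary {
        final String name;
        final String employeeDataId; // link to the separately stored payload

        CompanySummary(String name, String employeeDataId) {
            this.name = name;
            this.employeeDataId = employeeDataId;
        }
    }

    // Two separate stores: summaries stay small, payloads hold the large data.
    private final Map<String, CompanySummary> summaries = new LinkedHashMap<>();
    private final Map<String, String> payloads = new LinkedHashMap<>();
    private int payloadReads = 0;

    /** Stores the large employee JSON separately and keeps only a link in the summary. */
    public void addCompany(String id, String name, String employeeJson) {
        String payloadId = "payload-" + id;
        payloads.put(payloadId, employeeJson);
        summaries.put(id, new CompanySummary(name, payloadId));
    }

    /** Similar in spirit to "select name from Company": reads summaries only. */
    public List<String> listNames() {
        List<String> names = new ArrayList<>();
        for (CompanySummary summary : summaries.values()) {
            names.add(summary.name);
        }
        return names;
    }

    /** Follows the link; this is the only place the large payload is read. */
    public String loadEmployees(String id) {
        CompanySummary summary = summaries.get(id);
        if (summary == null) {
            return null; // unknown company id
        }
        payloadReads++;
        return payloads.get(summary.employeeDataId);
    }

    /** Number of times a large payload has been read so far. */
    public int payloadReads() {
        return payloadReads;
    }

    public static void main(String[] args) {
        LinkedRecordSketch store = new LinkedRecordSketch();
        store.addCompany("1", "CompanyA", "[{\"name\":\"John\",\"department\":\"HR\"}]");
        store.addCompany("2", "CompanyB", "[]");
        System.out.println(store.listNames());
        System.out.println(store.payloadReads());
    }
}
```

With this layout, listing the names never reads a payload; the large data is read only when loadEmployees is called for a specific company.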

Thank you! 

On Saturday, December 13, 2014 at 2:24:34 PM UTC-6, Lvc@ wrote:
>
> Hi BK,
> Could you try with OrientDB 2.0-SNAPSHOT?
>
> I don't know why it is so slow with only 16 records. Maybe it is the toJSON() 
> function?
>
> How long does just this take?
>
> select @rid as id, ifnull(name, "") as name from RawData
>
> Lvc@
>
>
> On 12 December 2014 at 20:37, BK <[email protected]> wrote:
>>
>> I've gone ahead and moved the large data field into a separate, linked 
>> record. I'd still appreciate any feedback on whether the behavior described 
>> below is expected, and/or the correct way to accomplish the described goal 
>> given the original record structure.
>>
>>
>> On Tuesday, December 9, 2014 10:49:24 PM UTC-5, BK wrote:
>>>
>>> I have a class, RawData, with records containing a *name* field and a *data* 
>>> field (each data field holds 50+ MB of text). I'm trying to produce a 
>>> fast query to retrieve just the RIDs and *name* fields for all RawData 
>>> records. (The goal is to give users a summary view of the names and RIDs 
>>> so that they can retrieve raw data by RID.)
>>>
>>> Unfortunately, the queries I've tried are running very slowly and seem 
>>> to be including/processing the large *data* field. Ideally, I'm 
>>> looking for a fast Java API query that will return something along the 
>>> lines of a JSONArray of the form:
>>> [ {  "id": "#13:0", "name": "foo" }, { "id": "#13:1", "name": "bar" } ]
>>> which I could then use directly as a JSONArray in Java.
>>>
>>> In the console, these queries seem to produce what I'm looking for, but 
>>> they're both too slow. With 16 records, the queries take about 25sec to run 
>>> in the console on a development machine (the fetchplan doesn't seem to 
>>> matter):
>>> select set(@this.toJSON('id,name')) from ( select @rid as id, ifnull(name, "") as name from RawData )
>>> select set(@this.toJSON('id,name')) from ( select @rid as id, ifnull(name, "") as name from RawData fetchPlan data:-2 )
>>>
>>> I'm using version 1.7.9.
>>>
>>> Is there a better/faster query for the results? Do I need to move the 
>>> large data field into a separate, linked record?
>>>
>>  -- 
>>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
