Re: [orientdb] Re: AWS performance question

Curt Kohler Mon, 27 Jun 2016 07:14:41 -0700

Luca,

Thanks for taking the time to reply. In answer to your question, yes and 
no. I was running the ETL tool on an instance that had only 2 cores, so 
there was really only one core available for the tool to utilize (hence the 
150/sec result for one thread). I actually wrote a simple Spark-based 
loading program (using OrientGraphNoTx and setting the intent for massive 
insert) and ran it as a job on my AWS Spark cluster for easily controllable 
parallelization. I was able to run up to 8 worker nodes (basically 8 
threads) before I started seeing exceptions come back from the calls. That 
gave a load rate of about 1,042 recs/sec (approx. 130 recs/sec/thread). I 
should note this rate was for creating edges between existing vertices from 
a file that had our internal ids for the nodes. The code had to look up the 
RIDs based on those keys (which had an index on them) and then create the 
link (basically the same work that our ETL config file was set up to do on 
our earlier runs).
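
To make the lookup-then-link step and the per-thread arithmetic concrete, 
here is a minimal sketch. It is not the actual loader: a plain Python dict 
stands in for the indexed key-to-RID lookup, no OrientDB calls are made, and 
the names resolve_edges, per_thread_rate, rid_index and the sample RIDs are 
all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def resolve_edges(
    rid_index: Dict[str, str],
    id_pairs: Iterable[Pair],
) -> Tuple[List[Pair], List[Pair]]:
    """Translate (from_id, to_id) pairs of internal ids into RID pairs.

    rid_index is an assumed stand-in for the index on the internal id
    property. Pairs that reference an unknown id are collected separately
    instead of raising, so one bad row does not stop the whole load.
    """
    resolved: List[Pair] = []
    missing: List[Pair] = []
    for from_id, to_id in id_pairs:
        from_rid = rid_index.get(from_id)
        to_rid = rid_index.get(to_id)
        if from_rid is None or to_rid is None:
            missing.append((from_id, to_id))
        else:
            # A real loader would create the edge between the two
            # vertices at this point; here we only record the RID pair.
            resolved.append((from_rid, to_rid))
    return resolved, missing


def per_thread_rate(total_per_sec: float, workers: int) -> float:
    """Average throughput per worker, given the aggregate rate."""
    return total_per_sec / workers


if __name__ == "__main__":
    index = {"a": "#12:0", "b": "#12:1", "c": "#12:2"}
    edges, skipped = resolve_edges(index, [("a", "b"), ("b", "c"), ("a", "z")])
    print(edges)
    print(skipped)
    # 1,042 recs/sec across 8 workers is roughly 130 recs/sec each.
    print(per_thread_rate(1042, 8))
```

Splitting the input file across workers and running the same routine in 
each one is what the 8-node Spark job amounted to; the per-thread figure is 
just the aggregate rate divided by the worker count.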


Glad to hear that you are going to provide some guidance on cloud 
deployment recommendations. Having that type of info would have been 
helpful during this exercise.

Curt



On Friday, June 24, 2016 at 12:33:52 PM UTC-4, l.garulli wrote:
>
> Hi guys,
>
> A couple of weeks ago we created an internal division in OrientDB to take 
> care of AWS (and other clouds). Soon we will publish some metrics about 
> OrientDB and Amazon AWS server configurations, so it will be much easier 
> to choose the right hw/sw configuration for your workload.
>
> Back to your first question, I think the ETL is slow because it does not 
> run in parallel. Have you tried the "parallel" option?
>
>
> Best Regards,
>
> Luca Garulli
> Founder & CEO
> OrientDB LTD <http://orientdb.com/>
>
>
> On 24 June 2016 at 10:57, Curt Kohler <[email protected]> wrote:
>
>> Sorry, I should have been more explicit. I moved over to the r3 
>> instance types to leverage the attached ephemeral SSD drives instead of the 
>> networked EBS drive, to take possible network issues out of the picture. 
>> I didn't notice anything specific in iostat when running the loads.
>>
>>
>> On Tuesday, June 21, 2016 at 12:12:33 AM UTC-4, Francisco Reyes wrote:
>>>
>>> On Monday, June 20, 2016 at 9:56:11 AM UTC-4, Curt Kohler wrote:
>>>>
>>>> eventually solved the issue). When we were finally able to run the 
>>>> files successfully, we were seeing throughput in the range of about 150 
>>>> edges/sec (running with one thread). 
>>>>
>>>
>>> Curt,
>>>
>>> New OrientDB user here, but I was wondering if you checked iostat to see 
>>> if it was an issue with the disk subsystem. Also, is the disk an SSD? Is 
>>> the disk using provisioned IOPS?
>>>
>> -- 
>>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

