RE: Slow Data Insertion On Large Cache : Spark Streaming

2018-11-12 Thread ApacheUser
Thanks Stan,
we are planning to move to 2.7.

Thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


RE: Slow Data Insertion On Large Cache : Spark Streaming

2018-11-12 Thread Stanislav Lukyanov
Hi,

Do you use persistence? Do you have more data on disk than fits in RAM?
If yes, it's almost certainly https://issues.apache.org/jira/browse/IGNITE-9519.
If not, it could still be the same issue.
Try running on 2.7; it should be released soon.
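
(For reference, "persistence" above means Ignite native persistence, which is turned on per data
region; a minimal illustrative sketch of that setting, not your actual configuration:)

// Illustrative only: a node "uses persistence" when its default data region
// has persistenceEnabled=true in the DataStorageConfiguration.
import org.apache.ignite.Ignition
import org.apache.ignite.configuration.{DataStorageConfiguration, IgniteConfiguration}

val storageCfg = new DataStorageConfiguration()
storageCfg.getDefaultDataRegionConfiguration.setPersistenceEnabled(true)

val ignite = Ignition.start(new IgniteConfiguration().setDataStorageConfiguration(storageCfg))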

Stan

From: ApacheUser
Sent: 5 November 2018, 20:10
To: user@ignite.apache.org
Subject: Slow Data Insertion On Large Cache : Spark Streaming


Slow Data Insertion On Large Cache : Spark Streaming

2018-11-05 Thread ApacheUser
Hi Team,

We have a 6-node Ignite cluster with 72 CPUs, 256GB RAM and 5TB of storage. Data is
ingested into the Ignite cluster using Spark Streaming, for SQL and Tableau usage.

I have a couple of large tables, one with 200 million rows (200GB) and one with 800 million
rows (500GB). An insert takes more than 40 seconds when the composite key already exists;
for a new row it is around 10ms.

We have Entry, Main and Details tables. The "Entry" cache has a single-field primary key
"id"; the second cache, "Main", has a composite primary key of "id" and "mainid"; the third
cache, "Details", has a composite primary key of "id", "mainrid" and "detailid". "id" is the
affinity key for these and some other small tables.

1. Is there any insert/update performance difference between a single-field primary key and
   a multi-field (composite) primary key?
   Would it make any difference if I converted the composite primary key into a single-field
   primary key, e.g. by concatenating all the composite fields into one field? (A small
   Spark-side sketch of that idea follows after question 2.)

2. Which ignite.sh and config parameters need tuning?
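
(Regarding question 1: a purely illustrative Spark-side sketch of the concatenation idea; "mainDf",
the ":" separator and the synthetic "pk" column are assumptions, not existing code.)

import org.apache.spark.sql.functions.concat_ws

// Hypothetical: collapse the composite key (id, mainid) into one synthetic key
// column before saving, so the table ends up with a single-field primary key.
val withSingleKey = mainDf.withColumn("pk", concat_ws(":", mainDf("id"), mainDf("mainid")))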

My Spark DataFrame save options (save to Ignite):

  .option(OPTION_STREAMER_ALLOW_OVERWRITE, true)
  .mode(SaveMode.Append)
  .save()
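
(For context, a minimal sketch of what the full DataFrame write presumably looks like with the
Ignite Spark data source; the config path, table name and key fields are placeholders, not the
actual values.)

import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.SaveMode

df.write
  .format(FORMAT_IGNITE)
  .option(OPTION_CONFIG_FILE, "/path/to/default-config.xml")  // placeholder path
  .option(OPTION_TABLE, "Main")                               // placeholder table name
  // the CREATE_TABLE options take effect only when the data source creates the table
  .option(OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS, "id,mainid")
  .option(OPTION_CREATE_TABLE_PARAMETERS, "template=partitioned,affinity_key=id")
  .option(OPTION_STREAMER_ALLOW_OVERWRITE, true)
  .mode(SaveMode.Append)
  .save()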

My Ignite.sh

JVM_OPTS="-server -Xms10g -Xmx10g -XX:+AggressiveOpts
-XX:MaxMetaspaceSize=512m"
JVM_OPTS="${JVM_OPTS} -XX:+AlwaysPreTouch"
JVM_OPTS="${JVM_OPTS} -XX:+UseG1GC"
JVM_OPTS="${JVM_OPTS} -XX:+ScavengeBeforeFullGC"
JVM_OPTS="${JVM_OPTS} -XX:+DisableExplicitGC"
JVM_OPTS="${JVM_OPTS} -XX:+HeapDumpOnOutOfMemoryError "
JVM_OPTS="${JVM_OPTS} -XX:HeapDumpPath=${IGNITE_HOME}/work"
JVM_OPTS="${JVM_OPTS} -XX:+PrintGCDetails"
JVM_OPTS="${JVM_OPTS} -XX:+PrintGCTimeStamps"
JVM_OPTS="${JVM_OPTS} -XX:+PrintGCDateStamps"
JVM_OPTS="${JVM_OPTS} -XX:+UseGCLogFileRotation"
JVM_OPTS="${JVM_OPTS} -XX:NumberOfGCLogFiles=10"
JVM_OPTS="${JVM_OPTS} -XX:GCLogFileSize=100M"
JVM_OPTS="${JVM_OPTS} -Xloggc:${IGNITE_HOME}/work/gc.log"
JVM_OPTS="${JVM_OPTS} -XX:+PrintAdaptiveSizePolicy"
JVM_OPTS="${JVM_OPTS} -XX:MaxGCPauseMillis=100"

export IGNITE_SQL_FORCE_LAZY_RESULT_SET=true

default-Config.xml (most of the XML was stripped by the mailing-list archive; only the Spring
schema declaration and the TCP discovery address list are recoverable):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- bean definitions stripped; only the discovery address list survived: -->
    64.x.x.x:47500..47509
    64.x.x.x:47500..47509
    64.x.x.x:47500..47509
    64.x.x.x:47500..47509
    64.x.x.x:47500..47509
    64.x.x.x:47500..47509

</beans>
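
(A purely illustrative reconstruction of how that surviving address list would typically map onto
the discovery configuration; an assumption, not the original file.)

import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder
import scala.collection.JavaConverters._

// Assumed shape: static IP finder listing the six (masked) node addresses.
val ipFinder = new TcpDiscoveryVmIpFinder()
ipFinder.setAddresses(Seq.fill(6)("64.x.x.x:47500..47509").asJava)

val cfg = new IgniteConfiguration()
  .setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(ipFinder))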
Thanks





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/