Re: Multiple servers in an Ignite Cluster
Hi Vlad,

The ideal workflow for my use case is: I host two clusters, one a computation cluster that runs Spark jobs, the other a data cluster that hosts Ignite nodes and caches hot data. At run time, multiple Spark jobs share this data cluster and query it.

The problem I have is that I am pre-loading the Ignite cache using a Spark job. Once the IgniteContext is instantiated, it launches one Ignite node per Spark executor I allocated. The distributed cache then spreads data across those nodes within my computation cluster as well, which I don't want, because the partial data hosted by those nodes will be gone once my pre-load job dies. Currently I force these nodes into client mode so that the cache is only distributed to the data cluster when I execute my pre-load job. Is there a better way to solve this?

Thanks,
Tracy

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Multiple-servers-in-a-Ignite-Cluster-tp8840p8887.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
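[Editor's note: a minimal sketch of the client-mode workaround described above, assuming Ignite's Spark integration as of the 1.x line. The type parameters, the sparkContext name, and the omitted discovery configuration are placeholders, not a tested setup:]

    // Sketch: force the Spark-embedded nodes into client mode so that cache
    // partitions live only on the standalone data-cluster servers.
    import org.apache.ignite.configuration.IgniteConfiguration
    import org.apache.ignite.spark.IgniteContext

    val igniteContext = new IgniteContext[String, Row](sparkContext, () => {
      val cfg = new IgniteConfiguration()
      // ... discovery SPI pointing at the data cluster's addresses ...
      cfg.setClientMode(true) // executor-side nodes join as clients and hold no cache data
      cfg
    })

With setClientMode(true), the nodes started inside the executors can still read and write the cache, but rebalancing never assigns partitions to them, so nothing is lost when the pre-load job exits.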
Re: Multiple servers in an Ignite Cluster
Thanks. Works fine now. -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Multiple-servers-in-a-Ignite-Cluster-tp8840p8851.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Multiple servers in an Ignite Cluster
Thanks. In that case, my question is how to define the scope of a cluster (or, how to specify which cluster a server belongs to). I assume that if someone else starts an Ignite node, my Ignite server would auto-discover it as well?

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Multiple-servers-in-a-Ignite-Cluster-tp8840p8844.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Multiple servers in an Ignite Cluster
I have the following Ignite config:

    def initializeIgniteConfig() = {
      val ipFinder = new TcpDiscoveryVmIpFinder()
      val HOST = "xx.xx.xx.xx:47500..47509"
      ipFinder.setAddresses(Collections.singletonList(HOST))
      val discoverySpi = new TcpDiscoverySpi()
      discoverySpi.setIpFinder(ipFinder)
      val igniteConfig = new IgniteConfiguration()
      igniteConfig.setDiscoverySpi(discoverySpi)
      // Ignite uses the work directory as a relative directory for internal
      // write activities, for example logging. Every Ignite node (server or
      // client) has its own work directory, independent of other nodes.
      igniteConfig.setWorkDirectory("/tmp")
      igniteConfig
    }

My use case is: I would like to have multiple Ignite servers, each caching a subset of the data, and I send distributed closures to each node to do local computation. In this case, can I start multiple servers on one single machine by just passing multiple IPs and ports? Or will I need to start each server on each machine separately? Thanks in advance!

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Multiple-servers-in-a-Ignite-Cluster-tp8840.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
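[Editor's note: the port range in the config above (47500..47509) already lets several nodes discover each other on one host. A rough sketch of starting two server nodes in a single JVM, reusing the initializeIgniteConfig() from the post; the grid names are made up for illustration and this is untested:]

    import org.apache.ignite.Ignition

    // Each node started in the same JVM needs a distinct grid name; both
    // bind to free ports within the 47500..47509 discovery range locally.
    val server1 = Ignition.start(initializeIgniteConfig().setGridName("server-1"))
    val server2 = Ignition.start(initializeIgniteConfig().setGridName("server-2"))

Whether multiple servers on one machine is a good idea is another matter: they share the machine's memory and CPU, so separate machines are the usual production layout.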
Re: What's the difference between EntryProcessor and distributed closure?
Thanks Alexey. By predicate/projection pushdown, I mean: currently I am storing a native Spark Row object as the value format of the IgniteCache. When I retrieve it as an IgniteRDD, I only want certain columns of that Row object, rather than getting the entire Row back and doing the filter/projection at the Spark level. Do you have better recommendations to achieve this?

Tracy

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/What-s-the-difference-between-EntryProcessor-and-distributed-closure-tp8759p8790.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
What's the difference between EntryProcessor and distributed closure?
What I would like to do is achieve predicate/column-projection push-down to the Ignite cache layer. I guess these two options could do it, couldn't they? If so, what's the difference? Are there any other options to achieve predicate push-down? Thanks in advance!

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/What-s-the-difference-between-EntryProcessor-and-distributed-closure-tp8759.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
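[Editor's note: besides EntryProcessor and closures, a third route worth noting as a hedged sketch is Ignite's SQL queries, which evaluate both the predicate and the projection on the server side. This assumes the cache's value class has queryable fields; the personCache name and the Person/name/age schema below are hypothetical:]

    import org.apache.ignite.cache.query.SqlFieldsQuery

    // Only the selected field crosses the wire; the WHERE clause is
    // evaluated inside the cache layer on each node holding the data.
    val cursor = personCache.query(
      new SqlFieldsQuery("select name from Person where age > ?")
        .setArgs(Integer.valueOf(30)))

This gives push-down without writing per-entry code, at the cost of having to annotate or configure the queryable fields up front.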
Is it possible to enable both REPLICATED and PARTITIONED?
As the subject says: is it possible to enable both REPLICATED and PARTITIONED cache modes?

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8167.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Fail to cache rdd: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
Hi Denis,

This is really helpful. Yes, I need the original dataframe for other APIs. Now I am using RDD[(String, Row)] as the type and caching the dataframe using:

    val rdd = df.map(row => (row.getAs[String]("KEY"), row))
    igniteRDD.savePairs(rdd)

It works perfectly fine. I was also able to reconstruct the dataframe from the values collection and the schema of the RDD.

Tracy

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Fail-to-cache-rdd-java-lang-NoSuchMethodError-scala-Predef-conforms-Lscala-Predef-less-colon-less-tp8123p8128.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
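[Editor's note: a minimal sketch of the reconstruction step mentioned above, assuming the original schema is still available as df.schema and a SQLContext is in scope as sqlContext (both names are illustrative). In Spark 1.6, SQLContext.createDataFrame accepts an RDD[Row] plus a StructType:]

    // igniteRDD holds (String, Row) pairs; .values drops the keys,
    // and the saved schema turns the Rows back into a DataFrame.
    val restoredDF = sqlContext.createDataFrame(igniteRDD.values, df.schema)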
Re: Fail to cache rdd: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
Thanks for the prompt reply. So if I want to cache a dataframe in IgniteCache, I have to define a custom data model class (e.g. https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/model/Person.java ) matching the schema of the dataframe, then construct objects and declare the cache to be the data model type? In other words, I have to do DataFrame => custom class objects => IgniteRDD, and when I retrieve it, IgniteRDD => custom class objects => DataFrame, right? Do I have other options for caching a dataframe?

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Fail-to-cache-rdd-java-lang-NoSuchMethodError-scala-Predef-conforms-Lscala-Predef-less-colon-less-tp8123p8126.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Fail to cache rdd: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
Hi team,

I was trying to cache a dataframe in the Ignite cache. I was able to cache generic-type data elements (RDD). However, each time I use igniteRDD.saveValues() to cache a non-generic data type (e.g. RDD), it triggers the NoSuchMethodError for saveValues shown below. I am using Scala 2.10, Spark 1.6.1 and Ignite 1.6.0. I did find that ignite-spark was pulling in spark-core_2.11. After excluding native Spark from ignite-spark, I still get the same error. Any suggestions? Thanks in advance!

Code sample:

    igniteRDD.saveValues(df.rdd());

Exceptions I got:

    Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:920)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:918)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:918)
    at org.apache.ignite.spark.IgniteRDD.saveValues(IgniteRDD.scala:138)
    at ignitecontext.IgniteRDDExample.run(IgniteRDDExample.java:81)
    at ignitecontext.IgniteRDDExample.main(IgniteRDDExample.java:35)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
    at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1$$anonfun$apply$1.apply(IgniteRDD.scala:151)
    at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1$$anonfun$apply$1.apply(IgniteRDD.scala:150)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1.apply(IgniteRDD.scala:150)
    at org.apache.ignite.spark.IgniteRDD$$anonfun$saveValues$1.apply(IgniteRDD.scala:138)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
Could IgniteCache be accessed through the Spark JDBC data source API?
Hey team,

I was able to use a JDBC driver tool to access IgniteCache. Is it possible to connect to Ignite through the Spark JDBC data source API? Below are the code and the exceptions I got. It seems like the connection is successful, but there are datatype mapping issues. Do I need to define a schema on the server side? Any suggestion would be really helpful. Thanks in advance!

Sample code:

    val props = Map(
      "url" -> "jdbc:ignite://ip:11211/TestCache",
      "driver" -> "org.apache.ignite.IgniteJdbcDriver",
      "dbtable" -> "Person"
    )
    val personDF = ctx.read.format("jdbc").options(props).load()

Exceptions:

    Exception in thread "main" java.sql.SQLException: Unsupported type
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$getCatalystType(JDBCRDD.scala:103)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$1.apply(JDBCRDD.scala:140)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$1.apply(JDBCRDD.scala:140)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:139)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:60)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at ignite.spark.DataFrameLoader$.personTest(DataFrameLoader.scala:131)
    at ignite.spark.DataFrameLoader.personTest(DataFrameLoader.scala)
    at jdbc.PersonLoader.executeTransaction(PersonLoader.java:175)
    at jdbc.PersonLoader.main(PersonLoader.java:115)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Could-IgniteCache-be-accessed-thr-Spark-JDBC-data-source-api-tp8122.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.