Re: HDFS Caching
Hi,

As far as I know, IGFS can redirect to the secondary file system not only to read from it, but also for integrity purposes (e.g., to check whether a file in the secondary FS was updated directly, without updating IGFS). In any case, the data itself will be read from memory if it is there. I would try to create several larger files and see whether adding IGFS improves performance.

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695p4713.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Running gridgain yardstick
Hi, How do you build the project? -Val -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Running-gridgain-yardstick-tp4559p4710.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Client fails to connect - joinTimeout vs networkTimeout
Caches always use an affinity function; it defines how the data is distributed across nodes. If you don't explicitly provide one in the configuration, RendezvousAffinityFunction will be used with excludeNeighbors=false. So if you want to enable this feature, you have to specify it in the configuration.

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Client-fails-to-connect-joinTimeout-vs-networkTimeout-tp4419p4709.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
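[Editor's note] For reference, a minimal Spring XML sketch of what enabling excludeNeighbors could look like; the cache name "myCache" and the surrounding configuration are assumed for illustration only:

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- "myCache" is a hypothetical cache name. -->
    <property name="name" value="myCache"/>
    <property name="affinity">
        <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
            <!-- Avoid placing primary and backup copies on nodes of the same physical host. -->
            <property name="excludeNeighbors" value="true"/>
        </bean>
    </property>
</bean>
```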
Re: Number of partitions of IgniteRDD
Vij,

This is because you're checking the partitions of the result DataFrame. IgniteRDD queries Ignite directly, gets the result back, and wraps it into the DataFrame. If you do any transformations with this DataFrame, they are not parallelized and are done on the driver; thus only one partition is returned. To get the correct number of partitions for the IgniteRDD, do the following:

jic.fromCache(PARTITIONED_CACHE_NAME).getPartitions();

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Number-of-partitions-of-IgniteRDD-tp4644p4708.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Client fails to connect - joinTimeout vs networkTimeout
Val, thanks a lot. Will this also work if the caches do not use affinity? We are trying not to use affinity because our data is very skewed. -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Client-fails-to-connect-joinTimeout-vs-networkTimeout-tp4419p4706.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Affinitykey is not working
Hi,

Scala does not automatically place annotations on generated fields; you need to apply the annotation as follows:

@(AffinityKeyMapped @field) val marketSectorId: Int = 0
Error starting c++ client node using 1.6
Hi All,

I downloaded the latest 1.6 binary from the latest builds. I am trying to start a node from C++ and getting the error below:

An error occurred: Failed to initialize JVM [errCls=java.lang.NoSuchMethodError, errMsg=executeNative]

The same C++ node starts fine if I point my IGNITE_HOME to 1.5 instead of 1.6. Any help is much appreciated.

Thanks.
Re: Number of partitions of IgniteRDD
My bad! Yes, you are right. But now the problem is: when I get a DataFrame using the code below and get a JavaRDD from it, its number of partitions is 1 and it is of type MapPartitionsRDD.

String sql = "select simulationUUID, stockReturn from STOCKSIMULATIONRETURNSVAL where businessDate = ? and symbol = ?";
DataFrame df = jic.fromCache(PARTITIONED_CACHE_NAME).sql(sql, businessDate, stock);
df.javaRDD();

This is causing performance issues for me, as I have only 1 partition and my reduceByKey method is not performing in the desired way.

Regards,
Vij

On Friday, April 29, 2016 6:11 PM, Vladimir Ozerov wrote:

Hi Vij, I see the method "getPartitions" in IgniteRDD, not "getNumPartitions". Please confirm that we are talking about the same thing. Anyway, the logic of this method is extremely straightforward: it simply calls the Ignite.affinity("name_of_your_cache").partitions() method, so it should return the actual number of partitions. "getPartitions" returns an array; could you please show what is printed to the console from your code? Vladimir.

On Fri, Apr 29, 2016 at 3:10 PM, vijayendra bhati wrote:

Yes, it's Spark RDD's standard method, but it has been overridden in IgniteRDD. Regards, Vij

On Friday, April 29, 2016 5:25 PM, Vladimir Ozerov wrote:

Hi Vij, I do not quite understand where the method "getNumPartitions" comes from. Is it in the standard Spark API? I do not see it on the org.apache.spark.api.java.JavaRDD class. Vladimir.

On Fri, Apr 29, 2016 at 7:50 AM, vijayendra bhati wrote:

Hi Val, I am creating the DataFrame using the code below:

public DataFrame getStockSimulationReturnsDataFrame(LocalDate businessDate, String stock) {
    /*
     * If we use a SQL query, we are assuming that the data is in the cache.
     */
    String sql = "select simulationUUID, stockReturn from STOCKSIMULATIONRETURNSVAL where businessDate = ? and symbol = ?";
    DataFrame df = jic.fromCache(PARTITIONED_CACHE_NAME).sql(sql, businessDate, stock);
    return df;
}

And to check partitions I am doing:

private JavaRDD getSimulationsForStock(String stock, LocalDate businessDate) {
    DataFrame df = StockSimulationsReaderFactory.getStockSimulationStore(jsc, businessDate, businessDate).getStockSimulationReturnsDataFrame(businessDate, stock);
    System.out.println("" + df.javaRDD().getNumPartitions());
    return df.javaRDD();
}

Regards,
Vij

On Friday, April 29, 2016 3:09 AM, vkulichenko wrote:

Hi Vij, How do you check the number of partitions and what are you trying to achieve? Can you show the code? -Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Number-of-partitions-of-IgniteRDD-tp4644p4671.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
HDFS Caching
I'm running the Ignite 1.5.0 Hadoop Accelerator version on top of CDH 5. I'm trying to write my own SecondaryFileSystem, but as a first step, I created one that just funnels all of the calls down to the IgniteHadoopIgfsSecondaryFileSystem, and I log every time one of my methods is called. I'm using the default configuration provided in the Hadoop Accelerator binary distribution, except I added my secondary file system to the configuration.

Every time I run hadoop fs -cat from the command line or ignite.fileSystem("igfs").open() from inside a Java app, the log statement in my SecondaryFileSystem's open method is printed out, even if I read the same file over and over. To me, that means my files aren't being cached inside Ignite (which is the reason I'm looking into Ignite). I feel like I must be missing something obvious. I tried creating a tiny (10 byte) ASCII text file and reading that, in case my files in HDFS were too big.

Thanks for any help.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Issue with Java 8 datatype LocalDate while using IgniteRDD
IgniteRDD treats the value as type struct instead of LocalDate, and hence the exception below occurs:

Caused by: scala.MatchError: 2016-03-17 (of class java.time.LocalDate)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:255)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:250)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:260)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:250)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
	at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:401)

It looks like a new type needs to be added in IgniteRDD.scala in the method private def dataType(typeName: String): DataType

Regards,
Vij

On Friday, April 29, 2016 5:26 PM, Vladimir Ozerov wrote:

Hi Vij, Do you see any exception or some other kind of error? Please provide a more detailed error description. Vladimir.

On Fri, Apr 29, 2016 at 2:47 PM, vijayendra bhati wrote:

Hi Guys, I am trying to store an object which contains an object of type LocalDate, a datatype of the Java 8 time API. I am facing issues with it while working with IgniteRDD. It looks like LocalDate is not handled in IgniteRDD, and maybe in Spark as well. Can anybody help here? Regards, Vij
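[Editor's note] Until LocalDate is handled, one possible workaround is to store java.sql.Date in the cached value object instead, since Spark's Catalyst maps java.sql.Date to DateType. The sketch below only shows the pure-JDK conversion; the class and method names are illustrative, not from the thread:

```java
import java.sql.Date;
import java.time.LocalDate;

public class LocalDateWorkaround {
    // Convert java.time.LocalDate to java.sql.Date before putting the
    // object into the cache, so IgniteRDD/Spark sees a supported type.
    public static Date toSqlDate(LocalDate businessDate) {
        return Date.valueOf(businessDate);
    }

    // Convert back to LocalDate when reading the value out of the cache.
    public static LocalDate fromSqlDate(Date sqlDate) {
        return sqlDate.toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate original = LocalDate.of(2016, 3, 17);
        Date stored = toSqlDate(original);
        // The round trip preserves the calendar date.
        System.out.println(fromSqlDate(stored).equals(original)); // prints "true"
    }
}
```

The conversion is lossless for calendar dates, so reads and writes stay symmetric.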
Re: SQL Aliases are not interpreted correctly
Hello, Thanks works like a charm! Up to the next level in my experiment. br jan -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/SQL-Aliases-are-not-interpreted-correctly-tp4281p4692.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Number of partitions of IgniteRDD
Hi Vij,

I see the method "getPartitions" in IgniteRDD, not "getNumPartitions". Please confirm that we are talking about the same thing. Anyway, the logic of this method is extremely straightforward: it simply calls the Ignite.affinity("name_of_your_cache").partitions() method, so it should return the actual number of partitions. "getPartitions" returns an array; could you please show what is printed to the console from your code?

Vladimir.

On Fri, Apr 29, 2016 at 3:10 PM, vijayendra bhati wrote:
> Yes, it's Spark RDD's standard method, but it has been overridden in
> IgniteRDD.
>
> Regards,
> Vij
>
> On Friday, April 29, 2016 5:25 PM, Vladimir Ozerov wrote:
>
> Hi Vij,
>
> I do not quite understand where the method "getNumPartitions" comes from.
> Is it in the standard Spark API? I do not see it on the
> org.apache.spark.api.java.JavaRDD class.
>
> Vladimir.
>
> On Fri, Apr 29, 2016 at 7:50 AM, vijayendra bhati wrote:
>
> Hi Val,
>
> I am creating the DataFrame using the code below:
>
> public DataFrame getStockSimulationReturnsDataFrame(LocalDate businessDate, String stock) {
>     /*
>      * If we use a SQL query, we are assuming that the data is in the cache.
>      */
>     String sql = "select simulationUUID, stockReturn from STOCKSIMULATIONRETURNSVAL where businessDate = ? and symbol = ?";
>     DataFrame df = jic.fromCache(PARTITIONED_CACHE_NAME).sql(sql, businessDate, stock);
>     return df;
> }
>
> And to check partitions I am doing:
>
> private JavaRDD getSimulationsForStock(String stock, LocalDate businessDate) {
>     DataFrame df = StockSimulationsReaderFactory.getStockSimulationStore(jsc, businessDate, businessDate).getStockSimulationReturnsDataFrame(businessDate, stock);
>     System.out.println("" + df.javaRDD().getNumPartitions());
>     return df.javaRDD();
> }
>
> Regards,
> Vij
>
> On Friday, April 29, 2016 3:09 AM, vkulichenko wrote:
>
> Hi Vij,
>
> How do you check the number of partitions and what are you trying to achieve? Can you show the code?
>
> -Val
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Number-of-partitions-of-IgniteRDD-tp4644p4671.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Affinitykey is not working
Hi,

Could you please explain how you detect the node to which a key is mapped? Do you use the Affinity API?

Vladimir.

On Fri, Apr 29, 2016 at 11:48 AM, nikhilknk wrote:
> I used the ToleranceCacheKey below as the key. I want to keep all the keys
> whose marketSectorId is the same on the same node, so I put the annotation
> "@AffinityKeyMapped" on marketSectorId.
>
> I started 3 nodes of an Ignite cluster, but the instruments of the same
> marketSectorId are shared among the three nodes.
>
> Does affinity not work in this case? Please suggest if I am missing
> anything.
>
> I am using Ignite 1.5 and Scala 2.10.5.
>
> /**
>  *
>  */
> package com.spse.pricing.domain
>
> import java.util.Date
> import org.apache.ignite.cache.affinity.AffinityKeyMapped
>
> /**
>  * @author nkakkireni
>  */
> case class ToleranceCacheKey(
>   val instrumentId: String = null,
>   val cycleId: Int = 0,
>   @AffinityKeyMapped val marketSectorId: Int = 0,
>   val runDate: Date = null
> )
>
> My cache configuration:
>
> val toleranceCache = {
>   val temp = ignite match {
>     case Some(s) => {
>       val toleranceCache = new CacheConfiguration[ToleranceCacheKey, ToleranceCacheValue]("toleranceCache")
>       toleranceCache.setCacheMode(CacheMode.PARTITIONED)
>       toleranceCache.setTypeMetadata(toleranceCacheMetadata())
>
>       val cache = s.getOrCreateCache(toleranceCache)
>       cache
>     }
>     case _ =>
>       logError("Getting toleranceCache cache failed")
>       throw new Throwable("Getting toleranceCache cache failed")
>   }
>   temp
> }
>
> def toleranceCacheMetadata() = {
>   val types = new ArrayList[CacheTypeMetadata]()
>
>   val cacheType = new CacheTypeMetadata()
>   cacheType.setValueType(classOf[ToleranceCacheValue].getName)
>
>   val qryFlds = cacheType.getQueryFields()
>   qryFlds.put("tradingGroupId", classOf[Int])
>
>   val indexedFlds = cacheType.getAscendingFields
>   indexedFlds.put("instrumentId", classOf[String])
>   indexedFlds.put("cycleId", classOf[Int])
>   indexedFlds.put("runDate", classOf[Date])
>   indexedFlds.put("marketSectorId", classOf[Int])
>
>   types.add(cacheType)
>
>   types
> }
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Affinitykey-is-not-working-tp4685.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Issue with Java 8 datatype LocalDate while using IgniteRDD
Hi Vij,

Do you see any exception or some other kind of error? Please provide a more detailed error description.

Vladimir.

On Fri, Apr 29, 2016 at 2:47 PM, vijayendra bhati wrote:
> Hi Guys,
>
> I am trying to store an object which contains an object of type LocalDate,
> a datatype of the Java 8 time API.
> I am facing issues with it while working with IgniteRDD.
> It looks like LocalDate is not handled in IgniteRDD, and maybe in Spark as well.
>
> Can anybody help here?
>
> Regards,
> Vij
Re: Ignite Installation with Spark under CDH
Hi Michael,

Ok, so it looks like the process didn't have enough heap. Thank you for your input about the CDH configuration. We will improve our documentation based on this.

Vladimir

On Thu, Apr 28, 2016 at 5:15 PM, mdolgonos wrote:
> Vladimir,
>
> I fixed this by changing the way I start Ignite, based on a recommendation
> from another post here, and the OOME is gone:
> ignite.sh -J-Xmx10g
> The data that I put in the cache is about 1.5 GB
>
> Thank you,
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-Installation-with-Spark-under-CDH-tp4457p4665.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
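[Editor's note] The fix described above amounts to passing JVM options through ignite.sh with the -J prefix; a minimal startup-command sketch (the 10 GB maximum heap is from the thread, the -Xms flag is an assumed optional addition):

```shell
# Options prefixed with -J are forwarded to the Ignite JVM.
# -J-Xmx10g sets the maximum heap; -J-Xms10g pre-allocates it up front.
./bin/ignite.sh -J-Xmx10g -J-Xms10g
```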