Re: HDFS Caching

2016-04-29 Thread vkulichenko
Hi,

As far as I know, IGFS can redirect to the secondary file system not only to
read from it, but also for integrity purposes (e.g., to check whether a file in
the secondary FS was updated directly, without going through IGFS). In any
case, the data itself will be read from memory if it is there. I would try
creating several larger files and see whether adding IGFS improves performance.
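
Whether reads go through to the secondary file system is governed by the IGFS
mode. As a rough sketch (bean and property names from the standard
FileSystemConfiguration; the HDFS URI is hypothetical, so verify against your
version), the relevant part of the Hadoop Accelerator config looks like this:

```xml
<bean class="org.apache.ignite.configuration.FileSystemConfiguration">
    <property name="name" value="igfs"/>
    <!-- DUAL_SYNC keeps IGFS and the secondary HDFS in sync; reads may
         still consult the secondary FS for consistency checks. -->
    <property name="defaultMode" value="DUAL_SYNC"/>
    <property name="secondaryFileSystem">
        <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
            <!-- Hypothetical namenode URI - use your own. -->
            <constructor-arg value="hdfs://namenode:9000"/>
        </bean>
    </property>
</bean>
```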

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695p4713.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Running gridgain yardstick

2016-04-29 Thread vkulichenko
Hi,

How do you build the project?

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Running-gridgain-yardstick-tp4559p4710.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Client fails to connect - joinTimeout vs networkTimeout

2016-04-29 Thread vkulichenko
Caches always use affinity; it defines how the data is distributed across
nodes. If you don't explicitly provide an affinity function in the
configuration, RendezvousAffinityFunction is used with excludeNeighbors=false.
So if you want to enable this feature, you have to specify it explicitly in the
configuration.
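
For example, a Spring XML fragment along these lines (a sketch; the cache name
is hypothetical, adjust it to your setup):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="myCache"/>
    <property name="affinity">
        <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
            <!-- Place primary and backup copies on different physical hosts. -->
            <property name="excludeNeighbors" value="true"/>
        </bean>
    </property>
</bean>
```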

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Client-fails-to-connect-joinTimeout-vs-networkTimeout-tp4419p4709.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Number of partitions of IgniteRDD

2016-04-29 Thread vkulichenko
Vij,

This is because you're checking the partitions of the result DataFrame.
IgniteRDD queries Ignite directly, gets the result back and wraps it into a
DataFrame. Any transformations you apply to this DataFrame are not
parallelized and are done on the driver, which is why only one partition is
returned.

To get the correct number of partitions for an IgniteRDD, do it like this:

jic.fromCache(PARTITIONED_CACHE_NAME).getPartitions();

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Number-of-partitions-of-IgniteRDD-tp4644p4708.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Client fails to connect - joinTimeout vs networkTimeout

2016-04-29 Thread bintisepaha
Val, thanks a lot. Will this also work if the caches do not use affinity?
We are trying not to use affinity because our data is very skewed.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Client-fails-to-connect-joinTimeout-vs-networkTimeout-tp4419p4706.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Affinitykey is not working

2016-04-29 Thread Alexey Goncharuk
Hi,

Scala does not automatically place annotations on generated fields; you
need to use the annotation as follows:

@(AffinityKeyMapped @field) val marketSectorId:Int = 0
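
A fuller sketch with the imports, using the ToleranceCacheKey fields from
earlier in this thread (untested; adjust to your class):

```scala
import scala.annotation.meta.field
import org.apache.ignite.cache.affinity.AffinityKeyMapped

case class ToleranceCacheKey(
  instrumentId: String = null,
  cycleId: Int = 0,
  // @field moves the annotation onto the generated field,
  // where Ignite's reflection can find it.
  @(AffinityKeyMapped @field) marketSectorId: Int = 0,
  runDate: java.util.Date = null
)
```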


Error starting c++ client node using 1.6

2016-04-29 Thread Murthy Kakarlamudi
Hi All,
I downloaded the latest 1.6 binary from the latest builds. I am trying to
start a node from C++ and am getting the below error.

An error occurred: Failed to initialize JVM
[errCls=java.lang.NoSuchMethodError, errMsg=executeNative]

The same C++ node starts fine if I point my IGNITE_HOME to 1.5 instead of
1.6.

Any help is much appreciated...

Thanks.


Re: Number of partitions of IgniteRDD

2016-04-29 Thread vijayendra bhati
My bad!!! Yes, you are right. But now the problem is that when I get a
DataFrame using the code below and get a JavaRDD from it, its number of
partitions is 1 and it is of type MapPartitionsRDD.

String sql = "select simulationUUID,stockReturn from STOCKSIMULATIONRETURNSVAL where businessDate = ? and symbol = ?";
DataFrame df = jic.fromCache(PARTITIONED_CACHE_NAME).sql(sql, businessDate, stock);
df.javaRDD();

This is causing performance issues for me, as I have only 1 partition and my
reduceByKey method is not performing the desired way.

Regards,
Vij
 


HDFS Caching

2016-04-29 Thread barham
I'm running Ignite 1.5.0 Hadoop Accelerator version on top of CDH 5.  I'm
trying to write my own SecondaryFileSystem, but as a first step, I created
one that just funnels all of the calls down to the
IgniteHadoopIgfsSecondaryFileSystem and I just log out every time one of my
methods is called.  I'm using the default configuration provided in the
Hadoop Accelerator binary distribution except I added my secondary file
system to the configuration.  

Every time I run hadoop fs -cat  from the command line or
ignite.fileSystem("igfs").open() from inside a Java app, the log
statement in my SecondaryFileSystem's open method is printed, even if I
read the same file over and over. To me, that means my files aren't being
cached inside Ignite (which is the reason I'm looking into Ignite). I feel
like I must be missing something obvious. I tried creating a tiny (10-byte)
ASCII text file and reading that, in case my files were too big in HDFS.

Thanks for any help.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/HDFS-Caching-tp4695.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Issue with Java 8 datatype LocalDate while using IgniteRDD

2016-04-29 Thread vijayendra bhati
IgniteRDD treats the value as a struct type instead of LocalDate, and hence
the exception below occurs:
Caused by: scala.MatchError: 2016-03-17 (of class java.time.LocalDate) at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:255)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:250)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:260)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:250)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
 at 
org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:401)


Looks like a new type needs to be added in IgniteRDD.scala, in the method
private def dataType(typeName: String): DataType
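
One JDK-only workaround until that is fixed (an untested sketch, not from
this thread; the helper names LocalDateBridge, toSql, and fromSql are
illustrative, not Ignite or Spark API): store java.sql.Date instead of
LocalDate, since Catalyst already has a converter for java.sql.Date, and
convert at the boundaries:

```java
import java.sql.Date;
import java.time.LocalDate;

public class LocalDateBridge {
    // Store java.sql.Date in the cache instead of java.time.LocalDate:
    // Spark's Catalyst converters understand java.sql.Date (DateType),
    // but (as the stack trace shows) have no converter for LocalDate.
    public static Date toSql(LocalDate d) {
        return Date.valueOf(d);
    }

    public static LocalDate fromSql(Date d) {
        return d.toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate businessDate = LocalDate.of(2016, 3, 17);
        Date stored = toSql(businessDate);       // stored as java.sql.Date
        System.out.println(stored);              // prints 2016-03-17
        System.out.println(fromSql(stored));     // prints 2016-03-17
    }
}
```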

Regards,
Vij


Re: SQL Aliases are not interpreted correctly

2016-04-29 Thread jan.swaelens
Hello,

Thanks, works like a charm! Up to the next level in my experiment.

br
jan



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/SQL-Aliases-are-not-interpreted-correctly-tp4281p4692.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Number of partitions of IgniteRDD

2016-04-29 Thread Vladimir Ozerov
Hi Vij,

I see method "getPartitions" in IgniteRDD, not "getNumPartitions". Please
confirm that we are talking about the same thing.

Anyway, the logic of this method is extremely straightforward - it simply
calls the Ignite.affinity("name_of_your_cache").partitions() method, so it
should return the actual number of partitions.
"getPartitions" returns an array; could you please show what is printed to
the console from your code?


Vladimir.

On Fri, Apr 29, 2016 at 3:10 PM, vijayendra bhati 
wrote:

> Yes, it's Spark RDD's standard method, but it has been overridden in
> IgniteRDD.
>
> Regards,
> Vij
>
>
> On Friday, April 29, 2016 5:25 PM, Vladimir Ozerov 
> wrote:
>
>
> Hi Vij,
>
> I do not quite understand where the method "getNumPartitions" comes from.
> Is it in the standard Spark API? I do not see it on the
> org.apache.spark.api.java.JavaRDD class.
>
> Vladimir.
>
> On Fri, Apr 29, 2016 at 7:50 AM, vijayendra bhati 
> wrote:
>
> Hi Val,
>
> I am creating DataFrame using below code -
>
> public DataFrame getStockSimulationReturnsDataFrame(LocalDate
> businessDate, String stock) {
>     /*
>      * If we use sql query, we are assuming that data is in cache.
>      */
>     String sql = "select simulationUUID,stockReturn from
> STOCKSIMULATIONRETURNSVAL where businessDate = ? and symbol = ?";
>     DataFrame df = jic.fromCache(PARTITIONED_CACHE_NAME).sql(sql,
> businessDate, stock);
>     return df;
> }
>
>
> And to check partitions I am doing -
>
> private JavaRDD getSimulationsForStock(String stock, LocalDate
> businessDate) {
>     DataFrame df = StockSimulationsReaderFactory.getStockSimulationStore(jsc,
>         businessDate,
>         businessDate).getStockSimulationReturnsDataFrame(businessDate, stock);
>     System.out.println("" + df.javaRDD().getNumPartitions());
>     return df.javaRDD();
> }
>
> Regards,
> Vij
>
>
> On Friday, April 29, 2016 3:09 AM, vkulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
>
> Hi Vij,
>
>
> How do you check the number of partitions and what are you trying to
> achieve? Can you show the code?
>
> -Val
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Number-of-partitions-of-IgniteRDD-tp4644p4671.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Affinitykey is not working

2016-04-29 Thread Vladimir Ozerov
Hi,

Could you please explain how you detect the node to which a key is mapped?
Do you use the Affinity API?

Vladimir.

On Fri, Apr 29, 2016 at 11:48 AM, nikhilknk  wrote:

> I used the below ToleranceCacheKey as the key. I want to keep all the keys
> whose marketSectorId is the same on the same node, so I put the
> "@AffinityKeyMapped" annotation on marketSectorId.
>
> I started a 3-node Ignite cluster, but the instruments with the same
> marketSectorId are shared among the three nodes.
>
> Does affinity not work in this case? Please suggest if I am missing
> anything.
>
> I am using Ignite 1.5 and Scala 2.10.5.
>
> /**
>  *
>  */
> package com.spse.pricing.domain
>
> import java.util.Date
> import org.apache.ignite.cache.affinity.AffinityKeyMapped
> /**
>  * @author nkakkireni
>  *
>  */
> case class ToleranceCacheKey (
>
> val instrumentId:String = null,
> val cycleId:Int = 0,
> @AffinityKeyMapped val marketSectorId:Int = 0,
> val runDate :Date = null
> )
>
>
>
> My cache configuration:
>
> val toleranceCache = {
>   val temp = ignite match {
>     case Some(s) => {
>       val toleranceCache = new
>         CacheConfiguration[ToleranceCacheKey, ToleranceCacheValue]("toleranceCache");
>       toleranceCache.setCacheMode(CacheMode.PARTITIONED);
>       toleranceCache.setTypeMetadata(toleranceCacheMetadata());
>
>       val cache = s.getOrCreateCache(toleranceCache)
>       cache
>     }
>     case _ => logError("Getting toleranceCache cache failed")
>       throw new Throwable("Getting toleranceCache cache failed")
>   }
>   temp
> }
>
> def toleranceCacheMetadata() = {
>   val types = new ArrayList[CacheTypeMetadata]();
>
>   val cacheType = new CacheTypeMetadata();
>   cacheType.setValueType(classOf[ToleranceCacheValue].getName);
>
>   val qryFlds = cacheType.getQueryFields();
>   qryFlds.put("tradingGroupId", classOf[Int]);
>
>   val indexedFlds = cacheType.getAscendingFields
>   indexedFlds.put("instrumentId", classOf[String]);
>   indexedFlds.put("cycleId", classOf[Int]);
>   indexedFlds.put("runDate", classOf[Date]);
>   indexedFlds.put("marketSectorId", classOf[Int]);
>
>   types.add(cacheType);
>
>   types;
> }
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Affinitykey-is-not-working-tp4685.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>


Re: Issue with Java 8 datatype LocalDate while using IgniteRDD

2016-04-29 Thread Vladimir Ozerov
Hi Vij,

Do you see any exception or some other kind of error? Please provide more
error description.

Vladimir.

On Fri, Apr 29, 2016 at 2:47 PM, vijayendra bhati 
wrote:

> Hi Guys,
>
> I am trying to store an object which contains an object of type LocalDate
> from the Java 8 time API.
> I am facing issues with it while working with IgniteRDD.
> It looks like LocalDate is not handled in IgniteRDD, and maybe in Spark as
> well.
>
> Can anybody help here?
>
> Regards,
> Vij
>


Re: Ignite Installation with Spark under CDH

2016-04-29 Thread Vladimir Ozerov
Hi Michael,

OK, so it looks like the process didn't have enough heap.
Thank you for your input about the CDH configuration. We will improve our
documentation based on this.

Vladimir

On Thu, Apr 28, 2016 at 5:15 PM, mdolgonos 
wrote:

> Vladimir,
>
> I fixed this by changing the way I start Ignite based on a recommendation
> from another post here and the OOME has gone:
> ignite.sh -J-Xmx10g
> The data that I put in cache is about 1.5GB
>
> Thank you,
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-Installation-with-Spark-under-CDH-tp4457p4665.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>