Re: Issues running Ignite with Cassandra and spark.

2018-10-01 Thread Shrey Garg
I fixed the issue with the DataFrame API and am getting all columns now.
However, I am not able to perform grouping + UDAF operations, as it tries to
perform these on Ignite.
Setting OPTION_DISABLE_SPARK_SQL_OPTIMIZATION = true is not helping.

How do we tell Ignite to just fetch the data and perform all other operations
in Spark?
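
For reference, here is a minimal sketch of how the option is wired into the
DataFrame read (the config file, table and column names are placeholders; as
noted above, this alone did not change the behaviour for us):

import org.apache.ignite.spark.IgniteDataFrameSettings;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class IgniteDataFrameRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("ignite-read").getOrCreate();

        // Ask Spark not to push the SQL plan down to Ignite, so that grouping/UDAFs
        // run inside Spark and Ignite acts only as the data source.
        Dataset<Row> df = spark.read()
                .format(IgniteDataFrameSettings.FORMAT_IGNITE())
                .option(IgniteDataFrameSettings.OPTION_CONFIG_FILE(), "ignite-config.xml")
                .option(IgniteDataFrameSettings.OPTION_TABLE(), "my_table")
                .option(IgniteDataFrameSettings.OPTION_DISABLE_SPARK_SQL_OPTIMIZATION(), true)
                .load();

        df.groupBy("some_column").count().show();
    }
}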


Re: Issues running Ignite with Cassandra and spark.

2018-10-01 Thread Shrey Garg
Hi,
Thanks for the answer.
Unfortunately, we cannot remove Cassandra as it is being used elsewhere as
well. We would have to write directly to Ignite and sync it with Cassandra.

We had a few other issues while getting data from Spark:

1) cacheRdd.sql("select * from table") is giving me heap memory (GC)
issues. However, getting data using spark.read.format() works fine. Why
is this so?

2) In my persistence settings I have IndexedTypes with key and value POJO
classes. The key class corresponds to the key in Cassandra, with partition and
clustering keys defined. While querying with SQL (select * from
value_class) I get all the columns of the table. However, while querying
using spark.read.format(...).option(OPTION_TABLE, value_class).load(), I
only get the columns stored in the value class. How do I fetch all the
columns using the DataFrame API?
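
For context, here is a minimal sketch of what is meant above by "IndexedTypes
with key and value POJO classes" (class and field names are placeholders, not
our real model):

import org.apache.ignite.cache.affinity.AffinityKeyMapped;
import org.apache.ignite.cache.query.annotations.QuerySqlField;
import org.apache.ignite.configuration.CacheConfiguration;

class MyKey {
    @AffinityKeyMapped
    @QuerySqlField(index = true)
    private String partitionCol;   // maps to the Cassandra partition key

    @QuerySqlField(index = true)
    private String clusteringCol;  // maps to a Cassandra clustering key
}

class MyValue {
    @QuerySqlField
    private String payloadCol;     // a non-key column
}

class MyCacheConfig {
    static CacheConfiguration<MyKey, MyValue> create() {
        CacheConfiguration<MyKey, MyValue> cfg = new CacheConfiguration<>("myCache");
        // Registers both classes for SQL: "select * from MyValue" then exposes the
        // key fields as well, while the DataFrame API only sees the MyValue fields.
        cfg.setIndexedTypes(MyKey.class, MyValue.class);
        return cfg;
    }
}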

Thanks,
Shrey



On Fri, 28 Sep 2018, 08:43 Alexey Kuznetsov,  wrote:

> Hi,  Shrey!
>
> Just as an idea - Ignite now has persistence (see
> https://apacheignite.readme.io/docs/distributed-persistent-store),
> maybe you can completely replace Cassandra with Ignite?
>
> In this case all data will always be up to date, with no need to sync with an external DB.
>
> --
> Alexey Kuznetsov
>


Re: Issues running Ignite with Cassandra and spark.

2018-09-27 Thread Alexey Kuznetsov
Hi,  Shrey!

Just as an idea - Ignite now has persistence (see
https://apacheignite.readme.io/docs/distributed-persistent-store),
maybe you can completely replace Cassandra with Ignite?

In this case all data will always be up to date, with no need to sync with an external DB.
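
A minimal sketch of what enabling native persistence looks like (assuming
Ignite 2.3 or newer, where the DataStorageConfiguration API is used):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class NativePersistenceStartup {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // Persist the default data region to disk instead of keeping it in RAM only.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        cfg.setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);

        // With persistence enabled the cluster starts inactive and must be activated.
        ignite.cluster().active(true);
    }
}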

-- 
Alexey Kuznetsov


Re: Issues running Ignite with Cassandra and spark.

2018-09-27 Thread ilya.kasnacheev
Hello!

1) There is no generic way of pulling updates from a 3rd-party database, and
there is usually no API support for it, so it's not obvious how we could
implement that even if we wanted to.

2) By default the cache store will process data in parallel on all nodes.
However, it will not align data distribution with that of Cassandra, and I
would say that implementing this would be infeasible. You could, however, try to
see if there are ways to speed up loadCache() by tuning the Ignite and/or cache
configurations.

Regards,





Issues running Ignite with Cassandra and spark.

2018-09-26 Thread Shrey
Hi, we are using Ignite as a cache layer over Cassandra for faster read
queries using Spark. Our cluster has 10 nodes, each running an instance of
Cassandra and Ignite. However, we came across a few issues:

1) We currently store the data from Spark to Cassandra. Hence, to load data,
we need to call .loadCache(). I know there are ways for data written to
Ignite to be synced with Cassandra (write-behind, write-through). However, we
want to do the opposite: load into Cassandra and have it reflected in the
cache, which can then be queried by Spark. Is there a way to do so?

2) To load data into the cache from Cassandra, I start a new client on
another machine and call the .loadCache() method. However, it takes almost
45 minutes to load the data (around 30 million rows with 20 columns each).
Is there a way to make this faster by ensuring that data from a particular
node in the Cassandra cluster is loaded in parallel into the cache instance of
the same node? I have defined my partition and clustering columns in my
Spring persistence-settings.
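
For reference, a minimal sketch of the load step described above (the config
path and cache name are placeholders):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class LoadFromCassandra {
    public static void main(String[] args) {
        Ignition.setClientMode(true);  // the extra client node mentioned above
        try (Ignite ignite = Ignition.start("ignite-client-config.xml")) {
            // Delegates to the Cassandra CacheStore on every server node; each
            // node runs the load and keeps only the entries it is responsible for.
            ignite.cache("myCache").loadCache(null);
        }
    }
}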

Thanks,
Shrey





Re: Ignite to Cassandra change from 1.9 to 2.0

2017-06-06 Thread Kenan Dalley
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at java.lang.Thread.run(Thread.java:745)
Disconnected from the target VM, address: '127.0.0.1:39960', transport:
'socket'

Process finished with exit code 1


HistoryResult Class

public class HistoryResult {
@QuerySqlField
private String key;

@QuerySqlField(name = "session_id")
private String sessionId;

@QuerySqlField(name = "session_time")
private Date sessionTime;

@QuerySqlField(name = "algorithm_name")
private String algorithmName;

@QueryTextField
private String results;

@QuerySqlField(name = "analysis_time")
private Date analysisTime;

@QuerySqlField(name = "created_dt")
private Date createdDate;

@QuerySqlField(name = "created_by")
private String createdBy;

@QuerySqlField(name = "modified_dt")
private Date modifiedDate;

@QuerySqlField(name = "modified_by")
private String modifiedBy;
...

HistoryResultKey Class

public class HistoryResultKey {
@AffinityKeyMapped
@QuerySqlField(index = true, groups = { "historyResultPK" })
private String key;

@QuerySqlField(index = true, groups = { "historyResultPK" }, name =
"session_id")
private String sessionId;

@QuerySqlField(index = true, groups = { "historyResultPK" }, name =
"algorithm_name")
private String algorithmName;
...

Persistence Settings [DOES NOT WORK]

(XML descriptor not preserved by the list archive; only the table options survived)
comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2


Persistence Settings [DOES WORK]

(XML descriptor not preserved by the list archive; only the table options survived)
comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2

--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13422.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite to Cassandra change from 1.9 to 2.0

2017-05-30 Thread Kenan Dalley
First, I do have setters on the class; I just didn't include them for space
reasons. Second, this is the exact opposite of what I was told originally
back in January for v1.8. So this looks like a breaking change
introduced in v2.0. I haven't checked the latest "what's new" recently.
Was there a mention in there about this change?


Previous responses to my questions back in January:

===

Igor Rudyak
Jan 04, 2017; 8:41pm   Re: Ignite with Cassandra questions / errors  

Ok, 

I took a look at the HistoryResult implementation once again and found the
reason. Your Java class should follow the JavaBeans conventions. The most
important point here is that your class should implement getter/setter methods
for READ/WRITE properties, or getters for READ-ONLY properties.

In your case you just have a class with private members annotated with
@QuerySqlField - this will not work. You should implement getter/setter
methods for these private fields and then annotate the getter or setter with
@QuerySqlField

===

Igor Rudyak
Jan 05, 2017; 11:51am   Re: Ignite with Cassandra questions / errors  

Hi Kenan, 

You missed the main point - the getters or setters of your custom classes should
be annotated with @QuerySqlField instead of the private class members. Here is a
slightly modified version of your custom classes which should work:

(the modified classes were not preserved by the list archive)

--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13247.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite to Cassandra change from 1.9 to 2.0

2017-05-25 Thread Igor Rudyak
The second problem is that your *HistoryResultKey* doesn't have setters. It
will not work without setters - your POJO classes should follow the JavaBeans
convention.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13161.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite to Cassandra change from 1.9 to 2.0

2017-05-25 Thread Igor Rudyak
First of all, the annotations *@QuerySqlField* and *@QueryTextField* are no
longer supported on methods. Because of this, the simplified persistence
descriptor doesn't work as expected.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13160.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite to Cassandra change from 1.9 to 2.0

2017-05-24 Thread Kenan Dalley
I upgraded from Ignite 1.9 to Ignite 2.0 and this started happening.  The
stacktrace is below.  Also, I'm including the .sh script that I'm running so
you can see that the only difference between the executions is pointing to
v1.9 versus pointing to v2.0.  My code didn't change, but it fails in 2.0
and works in 1.9.  You'll find the "main" execution code at the bottom.


.sh

#export IGNITE_HOME=/app/ignite/apache-ignite-fabric-1.9.0-bin
export IGNITE_HOME=/app/ignite/apache-ignite-fabric-2.0.0-bin
export IGNITE_HOME_LIBS=${IGNITE_HOME}/libs
export
IGNITE_LIBS=${IGNITE_HOME_LIBS}/*:${IGNITE_HOME_LIBS}/ignite-spring/*:${IGNITE_HOME_LIBS}/ignite-indexing/*:${IGNITE_HOME_LIBS}/ignite-cassandra-store/*:${IGNITE_HOME_LIBS}/ignite-cassandra-serializers/*

java -cp iat2Kafka-0.0.1.jar:conf:libs/*:${IGNITE_LIBS}
com.test.TestCassandraPersistence &


Execution/StackTrace

[user1@host001 app1]$ ./run.sh
[user1@host001 app1]$ [09:43:25]__  
[09:43:25]   /  _/ ___/ |/ /  _/_  __/ __/
[09:43:25]  _/ // (7 7// /  / / / _/
[09:43:25] /___/\___/_/|_/___/ /_/ /___/
[09:43:25]
[09:43:25] ver. 2.0.0#20170430-sha1:d4eef3c6
[09:43:25] 2017 Copyright(C) Apache Software Foundation
[09:43:25]
[09:43:25] Ignite documentation: http://ignite.apache.org
[09:43:25]
[09:43:25] Quiet mode.
[09:43:25]   ^-- Logging to file
'/app/ignite/apache-ignite-fabric-2.0.0-bin/work/log/ignite-de853f44.0.log'
[09:43:25]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false
or "-v" to ignite.{sh|bat}
[09:43:25]
[09:43:25] OS: Linux 2.6.32-696.el6.x86_64 amd64
[09:43:25] VM information: Java(TM) SE Runtime Environment 1.8.0_131-b11
Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.131-b11
[09:43:25] Configured plugins:
[09:43:25]   ^-- None
[09:43:25]
[09:43:25] Message queue limit is set to 0 which may lead to potential OOMEs
when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to
message queues growth on sender and receiver sides.
[09:43:25] Security status [authentication=off, tls/ssl=off]
log4j:WARN No appenders could be found for logger
(org.springframework.beans.factory.support.DefaultListableBeanFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
[09:43:27] Performance suggestions for grid  (fix if possible)
[09:43:27] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[09:43:27]   ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM
options)
[09:43:27]   ^-- Specify JVM heap max size (add '-Xmx[g|G|m|M|k|K]' to
JVM options)
[09:43:27]   ^-- Set max direct memory size if getting 'OOME: Direct buffer
memory' (add '-XX:MaxDirectMemorySize=[g|G|m|M|k|K]' to JVM options)
[09:43:27]   ^-- Disable processing of calls to System.gc() (add
'-XX:+DisableExplicitGC' to JVM options)
[09:43:27]   ^-- Speed up flushing of dirty pages by OS (alter
vm.dirty_expire_centisecs parameter by setting to 500)
[09:43:27]   ^-- Reduce pages swapping ratio (set vm.swappiness=10)
[09:43:27]   ^-- Avoid direct reclaim and page allocation failures (set
vm.extra_free_kbytes=124)
[09:43:27]   ^-- Enable write-behind to persistent store (set
'writeBehindEnabled' to true)
[09:43:27] Refer to this page for more performance suggestions:
https://apacheignite.readme.io/docs/jvm-and-system-tuning
[09:43:27]
[09:43:27] To start Console Management & Monitoring run
ignitevisorcmd.{sh|bat}
[09:43:27]
[09:43:27] Ignite node started OK (id=de853f44)
[09:43:27] Topology snapshot [ver=1, servers=1, clients=0, CPUs=144,
heap=27.0GB]

>>> Cache store example started.
>>> Putting to C*.  Key: [HistoryResultKey = [key: key1, sessionId:
>>> sessionId1, algorithmName: algoName2]], Result: [HistoryResult = [key:
>>> key1, sessionId: sessionId1, sessionTime: 2017-05-24 09:43:27.292,
>>> algorithmName: algoName2, results: results-2017-05-24T09:43:27.298,
>>> analysisTime: 2017-05-24 09:43:27.298, createdDate: 2017-05-24
>>> 09:43:27.298, createdBy: creator, modifiedDate: 2017-05-24 09:43:27.298,
>>> modifiedBy: updater]]
[09:43:38,214][SEVERE][main][CassandraCacheStore] Failed to execute
Cassandra CQL statement: insert into "dev_qlty"."HistoryResult"
("algorithmname", "sessionid", "key", "analysistime", "createdby",
"createddate", "modifiedby", "modifieddate", "results", "sessiontime")
values (?,?,?,?,?,?,?,?,?,?) using ttl 2592000;
class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL
statement: insert into "dev_qlty"."HistoryResult" ("algorithmname",
"sessionid", "key", "analysistime", "createdby", "createddate",
"modifiedby", &quo

Re: Ignite to Cassandra change from 1.9 to 2.0

2017-05-23 Thread Igor Rudyak
Could you please provide the full exception stack trace? What do you mean by
"upgraded to version 2.0" - is it the Cassandra or Ignite version, or something
else?



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13108.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Ignite to Cassandra change from 1.9 to 2.0

2017-05-23 Thread Kenan Dalley
I've just upgraded to the new version 2.0 from 1.8/1.9 with a Cassandra .xml
configuration as the backing data-store.  However, v2.0 is failing to pull
the column name that's tied to the field name in the persistence definition. 
Because of that, the CQL being generated is trying to call Cassandra using
the field name (e.g. sessionId) instead of the column name (e.g.
session_id).  I'm including my persistence.xml configurations that I've
tried: one using a pure POJO strategy definition and one using a POJO
strategy with the definition spelled out in the config file.  Neither
worked.

Error
[13:20:54,721][SEVERE][main][CassandraCacheStore] Failed to execute
Cassandra CQL statement: insert into "dev_keyspace"."HistoryResult"
("algorithmname", "sessionid", "key", "analysistime", "createdby",
"createddate", "modifiedby", "modifieddate", "results", "sessiontime")
values (?,?,?,?,?,?,?,?,?,?) using ttl 2592000;
class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL
statement: insert into "dev_keyspace"."HistoryResult" ("algorithmname",
"sessionid", "key", "analysistime", "createdby", "createddate",
"modifiedby", "modifieddate", "results", "sessiontime") values
(?,?,?,?,?,?,?,?,?,?) using ttl 2592000;


Pure POJO Strategy Config

(XML descriptor not preserved by the list archive; only the table options survived)
comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2


POJO Strategy Full Definition Config

(XML descriptor not preserved by the list archive; only the table options survived)
comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2

HistoryResultKey
public class HistoryResultKey {
private String key;
private String sessionId;
private String algorithmName;

public HistoryResultKey() {
// No op.
}

public HistoryResultKey(final String key, final String sessionId, final
String algorithmName) {
this.key = key;
this.sessionId = sessionId;
this.algorithmName = algorithmName;
}

@AffinityKeyMapped
@QuerySqlField(index = true, groups = { "historyResultPK" })
public String getKey() {
return this.key;
}

@QuerySqlField(index = true, groups = { "historyResultPK" }, name =
"session_id")
public String getSessionId() {
return this.sessionId;
}

@QuerySqlField(index = true, groups = { "historyResultPK" }, name =
"algorithm_name")
public String getAlgorithmName() {
return this.algorithmName;
}
}


HistoryResult
public class HistoryResult {
private String key;
private String sessionId;
private Date sessionTime;
private String algorithmName;
private String results;
private Date analysisTime;
private Date createdDate;
private String createdBy;
private Date modifiedDate;
private String modifiedBy;

public HistoryResult() {
// no op
}

public HistoryResult(final String key, final String sessionId, final Date sessionTime,
final String algorithmName, final String results, final Date analysisTime,
final Date createdDate, final String createdBy, final Date modifiedDate,
final String modifiedBy)
{
this.key = key;
this.sessionId = sessionId;
this.sessionTime = sessionTime;
this.algorithmName = algorithmName;
this.results = results;
this.analysisTime = analysisTime;
this.createdDate = createdDate;
this.createdBy = createdBy;
this.modifiedDate = modifiedDate;
this.modifiedBy = modifiedBy;
}

@QuerySqlField
public String getKey() {
return this.key;
}

@QuerySqlField(name = "session_id")
public String getSessionId() {
return this.sessionId;
}

@QuerySqlField(name = "session_time")
public Date getSessionTime() {
return this.sessionTime;
}

@QuerySqlField(name = "algorithm_name")
public String getAlgorithmName() {
return this.algorithmName;
}

@QueryTextField
public String getRe

Re: Newbie: Questions on Ignite over cassandra

2017-01-26 Thread vkulichenko
I meant that Cassandra itself will be involved only when you load the data
into caches, which is a separate step that should happen prior to query
execution. When an Ignite query is executed, Cassandra is not touched.

The answer to your question is yes - any joins are possible, similar to any
relational database. However, for good performance you should consider
collocation and indexing. See the documentation for details:
https://apacheignite.readme.io/docs/sql-grid
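
Purely as an illustration of the kind of query this enables (the table, column
and cache names below are invented, not from this thread):

import java.util.List;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class JoinQueryExample {
    static List<List<?>> bigOrders(IgniteCache<?, ?> personCache) {
        // A join across two caches, executed entirely by Ignite's SQL engine;
        // "orders" is the name of the second cache, used as the SQL schema.
        SqlFieldsQuery qry = new SqlFieldsQuery(
            "select p.name, o.total " +
            "from Person p join \"orders\".OrderRecord o on o.personId = p.id " +
            "where o.total > ?").setArgs(100);

        return personCache.query(qry).getAll();
    }
}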

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Newbie-Questions-on-Ignite-over-cassandra-tp10264p10274.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Newbie: Questions on Ignite over cassandra

2017-01-26 Thread vkulichenko
Hi,

Please properly subscribe to the mailing list so that the community can
receive email notifications for your messages. To subscribe, send empty
email to user-subscr...@ignite.apache.org and follow simple instructions in
the reply.


Jenny B. wrote
> I am exploring Apache Ignite on top of Cassandra as a possible tool to be
> able to give ad-hoc queries on cassandra tables. Using Ignite is it
> possible to able to search or query on any column in the underlying
> cassandra tables, like a RDBMS? Or can the join columns and search columns
> only be partition and clustering columns ?
> 
> If using Ignite, is there still need to create indexes on cassandra ? Also
> how does ignite treat materialized views ? Will there be a need to create
> materialized views ?
> 
> Also any insights into how updates to cassandra release can/will be
> handled by Ignite would be very helpful.

If you execute in-memory SQL queries using the Ignite API, you have to load the
data from the store first:
https://apacheignite.readme.io/docs/data-loading#ignitecacheloadcache

Read-through works only for key-based access. With queries you don't know the
set of required keys in advance, so only data which is already in memory is
used to execute them.
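
A small sketch of that distinction (cache name, key and value types are
placeholders):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class ReadThroughVsSql {
    static void example(Ignite ignite) {
        IgniteCache<Long, Object> cache = ignite.cache("personCache");

        // Key-based access: a miss goes through the CacheStore (Cassandra) transparently.
        Object person = cache.get(42L);

        // SQL only sees entries that are already in memory, so load the cache up front.
        cache.loadCache(null);
        List<List<?>> rows = cache.query(
            new SqlFieldsQuery("select name from Person")).getAll();
    }
}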

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Newbie-Questions-on-Ignite-over-cassandra-tp10264p10268.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-05 Thread Kenan Dalley
Ah, now I understand.  

That worked.  Thanks!



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9913.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-05 Thread Igor Rudyak

@QuerySqlField(name="analysis_time")
public Date getAnalysisTime() {
return analysisTime;
}

public void setAnalysisTime(Date analysisTime) {
this.analysisTime = analysisTime;
}

@QuerySqlField(name="created_dt")
public Date getCreatedDate() {
return createdDate;
}

public void setCreatedDate(Date createdDate) {
this.createdDate = createdDate;
}

@QuerySqlField(name="created_by")
public String getCreatedBy() {
return createdBy;
}

public void setCreatedBy(String createdBy) {
this.createdBy = createdBy;
}

@QuerySqlField(name="modified_dt")
public Date getModifiedDate() {
return modifiedDate;
}

public void setModifiedDate(Date modifiedDate) {
this.modifiedDate = modifiedDate;
}

@QuerySqlField(name="modified_by")
public String getModifiedBy() {
return modifiedBy;
}

public void setModifiedBy(String modifiedBy) {
this.modifiedBy = modifiedBy;
}

@Override
public boolean equals(Object o) {
if (this == o)
return true;

if (o == null)
return false;

if (!(o instanceof HistoryResult))
return false;

HistoryResult that = (HistoryResult)o;

if (vin != null ? !vin.equals(that.vin) : that.vin != null)
return false;

if (sessionId != null ? !sessionId.equals(that.sessionId) :
that.sessionId != null)
return false;

if (sessionTime != null ? !sessionTime.equals(that.sessionTime) :
that.sessionTime != null)
return false;

if (histName!= null ? !histName.equals(that.histName) :
that.histName!= null)
return false;

if (results != null ? !results.equals(that.results) : that.results
!= null)
return false;

if (analysisTime != null ? !analysisTime.equals(that.analysisTime) :
that.analysisTime != null)
return false;

if (createdDate != null ? !createdDate.equals(that.createdDate) :
that.createdDate != null)
return false;

if (createdBy != null ? !createdBy.equals(that.createdBy) :
that.createdBy != null)
return false;

if (modifiedDate != null ? !modifiedDate.equals(that.modifiedDate) :
that.modifiedDate != null)
return false;

if (modifiedBy != null ? !modifiedBy.equals(that.modifiedBy) :
that.modifiedBy != null)
return false;

return true;
}

@Override
public int hashCode() {
int res = vin != null ? vin.hashCode() : 0;
res = 31 * res + (sessionId != null ? sessionId.hashCode() : 0);
res = 31 * res + (sessionTime != null ? sessionTime.hashCode() : 0);
res = 31 * res + (histName!= null ? histName.hashCode() : 0);
res = 31 * res + (results != null ? results.hashCode() : 0);
res = 31 * res + (analysisTime != null ? analysisTime.hashCode() :
0);
res = 31 * res + (createdDate != null ? createdDate.hashCode() : 0);
res = 31 * res + (createdBy != null ? createdBy.hashCode() : 0);
res = 31 * res + (modifiedDate != null ? modifiedDate.hashCode() :
0);
res = 31 * res + (modifiedBy != null ? modifiedBy.hashCode() : 0);
return res;
}

@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("HistoryResult = [");
sb.append("vin: ");
sb.append(vin);
sb.append(", sessionId: ");
sb.append(sessionId);
sb.append(", sessionTime: ");
sb.append(sessionTime);
sb.append(", histName: ");
sb.append(histName);
sb.append(", results: ");
sb.append(results);
sb.append(", analysisTime: ");
sb.append(analysisTime);
sb.append(", createdDate: ");
sb.append(createdDate);
sb.append(", createdBy: ");
sb.append(createdBy);
sb.append(", modifiedDate: ");
sb.append(modifiedDate);
sb.append(", modifiedBy: ");
sb.append(modifiedBy);
sb.append("]");
return sb.toString();
    }
}


For these two classes your Cassandra persistence descriptor could be as
simple as (XML not preserved by the list archive; only the table options
survived):

comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2
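
A sketch of the shape of that simplified descriptor (the original XML was
stripped by the archive; the package names are placeholders, the keyspace and
table match the error messages earlier in the thread):

<persistence keyspace="mykeyspace" table="HistoryResult">
    <tableOptions>
        comment = 'Test table for Ignite/Cassandra connection'
        AND read_repair_chance = 0.2
    </tableOptions>
    <keyPersistence class="com.example.HistoryResultKey" strategy="POJO"/>
    <valuePersistence class="com.example.HistoryResult" strategy="POJO"/>
</persistence>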



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9909.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-05 Thread Kenan Dalley
   public String getHistName() {
return histName;
}

public void setHistName(String histName) {
this.histName= histName;
}

public String getResults() {
return results;
}

public void setResults(String results) {
this.results = results;
}

public Date getAnalysisTime() {
return analysisTime;
}

public void setAnalysisTime(Date analysisTime) {
this.analysisTime = analysisTime;
}

public Date getCreatedDate() {
return createdDate;
}

public void setCreatedDate(Date createdDate) {
this.createdDate = createdDate;
}

public String getCreatedBy() {
return createdBy;
}

public void setCreatedBy(String createdBy) {
this.createdBy = createdBy;
}

public Date getModifiedDate() {
return modifiedDate;
}

public void setModifiedDate(Date modifiedDate) {
this.modifiedDate = modifiedDate;
}

public String getModifiedBy() {
return modifiedBy;
}

public void setModifiedBy(String modifiedBy) {
this.modifiedBy = modifiedBy;
}

@Override
public boolean equals(Object o) {
if (this == o)
return true;

if (o == null)
return false;

if (!(o instanceof HistoryResult))
return false;

HistoryResult that = (HistoryResult)o;

if (vin != null ? !vin.equals(that.vin) : that.vin != null)
return false;

if (sessionId != null ? !sessionId.equals(that.sessionId) :
that.sessionId != null)
return false;

if (sessionTime != null ? !sessionTime.equals(that.sessionTime) :
that.sessionTime != null)
return false;

if (histName!= null ? !histName.equals(that.histName) :
that.histName!= null)
return false;

if (results != null ? !results.equals(that.results) : that.results
!= null)
return false;

if (analysisTime != null ? !analysisTime.equals(that.analysisTime) :
that.analysisTime != null)
return false;

if (createdDate != null ? !createdDate.equals(that.createdDate) :
that.createdDate != null)
return false;

if (createdBy != null ? !createdBy.equals(that.createdBy) :
that.createdBy != null)
return false;

if (modifiedDate != null ? !modifiedDate.equals(that.modifiedDate) :
that.modifiedDate != null)
return false;

if (modifiedBy != null ? !modifiedBy.equals(that.modifiedBy) :
that.modifiedBy != null)
return false;

return true;
}

@Override
public int hashCode() {
int res = vin != null ? vin.hashCode() : 0;
res = 31 * res + (sessionId != null ? sessionId.hashCode() : 0);
res = 31 * res + (sessionTime != null ? sessionTime.hashCode() : 0);
res = 31 * res + (histName!= null ? histName.hashCode() : 0);
res = 31 * res + (results != null ? results.hashCode() : 0);
res = 31 * res + (analysisTime != null ? analysisTime.hashCode() :
0);
res = 31 * res + (createdDate != null ? createdDate.hashCode() : 0);
res = 31 * res + (createdBy != null ? createdBy.hashCode() : 0);
res = 31 * res + (modifiedDate != null ? modifiedDate.hashCode() :
0);
res = 31 * res + (modifiedBy != null ? modifiedBy.hashCode() : 0);
return res;
}

@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("HistoryResult = [");
sb.append("vin: ");
sb.append(vin);
sb.append(", sessionId: ");
sb.append(sessionId);
sb.append(", sessionTime: ");
sb.append(sessionTime);
sb.append(", histName: ");
sb.append(histName);
sb.append(", results: ");
sb.append(results);
sb.append(", analysisTime: ");
sb.append(analysisTime);
sb.append(", createdDate: ");
sb.append(createdDate);
sb.append(", createdBy: ");
sb.append(createdBy);
sb.append(", modifiedDate: ");
sb.append(modifiedDate);
sb.append(", modifiedBy: ");
sb.append(modifiedBy);
sb.append("]");
return sb.toString();
}
}





--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9900.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-04 Thread Igor Rudyak
Ok, 

I took a look at the *HistoryResult* implementation once again and found the
reason. Your Java class should follow the  JavaBeans Conventions
<http://docstore.mik.ua/orelly/java-ent/jnut/ch06_02.htm>  . The most
important point here is that your class should implement getter/setter methods
for READ/WRITE properties, or getters for READ-ONLY properties.

In your case you just have a class with *private* members annotated with
*@QuerySqlField* - this will not work. You should implement getter/setter
methods for these *private* fields and then annotate the getter or setter with
*@QuerySqlField*



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9887.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-04 Thread Kenan Dalley
Ok, so I got my code working reading/inserting/updating from/to C*, but I had
to manually set up the persistence.xml as below and change my
java.sql.Timestamp types in the class to java.util.Date types.

(persistence.xml not preserved by the list archive; only the table options survived)
comment = 'Test table for Ignite/Cassandra connection'
AND read_repair_chance = 0.2

--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9877.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2017-01-04 Thread Kenan Dalley
This is what I remember reading in the docs as well.  However, I just ran the
DDLGenerator using my Cassandra-persistence-settings.xml file and the
following is what it generated.  So, either I don't have something set up
correctly for Ignite to recognize the Annotations, or there's a problem in
the DDLGenerator such that the Annotations are being ignored by it.  I'm
perfectly willing to accept that I've done something wrong.  


-
DDL for keyspace/table from file:
resources/cassandra-persistence-settings.xml
-

create keyspace if not exists "dev_qlty"
with replication = {'class' : 'SimpleStrategy', 'replication_factor' : 3}
and durable_writes = true;

create table if not exists "dev_qlty"."HistoryResult"
(
 "algorithmname" text,
 "sessionid" text,
 "vin" text,
 "createdby" text,
 "modifiedby" text,
 "results" text,
 primary key (("algorithmname", "sessionid", "vin"))
) 
with comment = 'Test table for Ignite/Cassandra connection' AND
read_repair_chance = 0.2;





--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9875.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra questions / errors

2016-12-16 Thread Igor Rudyak
Hi,

This is not actually 100% true - the Cassandra integration supports
@QuerySqlField annotations for creating tables and for mapping between
object fields and table columns.

Kenan, have you tried the Cassandra DDL generator
https://apacheignite-mix.readme.io/docs/ddl-generator for your persistence
descriptor? It will generate the DDL for your table - that way you can check if
you missed something.
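
For reference, the generator is run from the command line roughly like this
(the exact jar list depends on the installation; the settings file path is a
placeholder):

java -cp ignite-core.jar:ignite-cassandra.jar:cassandra-driver-core.jar:spring-core.jar \
     org.apache.ignite.cache.store.cassandra.utils.DDLGenerator \
     /path/to/cassandra-persistence-settings.xml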

Igor



On Fri, Dec 16, 2016 at 11:01 AM, vkulichenko <valentin.kuliche...@gmail.com
> wrote:

> Hi,
>
> @QuerySqlField is an annotation for Ignite SQL [1]; it has nothing to do with
> the Cassandra integration. To specify a column name which differs from the Java
> field name, you should use 'field' tags inside 'valuePersistence', as shown in
> the example [2].
>
> [1]
> https://apacheignite.readme.io/v1.8/docs/indexes#
> annotation-based-configuration
> [2] https://apacheignite-mix.readme.io/docs/examples#example-5
>
> -Val
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Ignite-with-Cassandra-questions-
> errors-tp9607p9608.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>


Re: Ignite with Cassandra questions / errors

2016-12-16 Thread vkulichenko
Hi,

@QuerySqlField is an annotation for Ignite SQL [1]; it has nothing to do with
the Cassandra integration. To specify a column name which differs from the Java
field name, you should use 'field' tags inside 'valuePersistence', as shown in
the example [2].

[1]
https://apacheignite.readme.io/v1.8/docs/indexes#annotation-based-configuration
[2] https://apacheignite-mix.readme.io/docs/examples#example-5
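
A sketch of what such 'field' tags look like in a persistence descriptor (the
keyspace, packages and column mappings below are placeholders):

<persistence keyspace="mykeyspace" table="HistoryResult">
    <keyPersistence class="com.example.HistoryResultKey" strategy="POJO">
        <partitionKey>
            <field name="vin" column="vin"/>
            <field name="sessionId" column="session_id"/>
            <field name="histName" column="hist_name"/>
        </partitionKey>
    </keyPersistence>
    <valuePersistence class="com.example.HistoryResult" strategy="POJO">
        <field name="sessionTime" column="session_time"/>
        <field name="createdBy" column="created_by"/>
        <field name="modifiedBy" column="modified_by"/>
        <field name="results" column="results"/>
    </valuePersistence>
</persistence>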

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9608.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Ignite with Cassandra questions / errors

2016-12-16 Thread Kenan Dalley
Hi.  I have 2 questions regarding Ignite & Cassandra.

I'm using Ignite v1.8

I'm trying to get a very simple example working with read/write through to
c*, but I'm having some difficulty.

First, I'm trying to use the POJO strategy of configuration for both the key
& value persistence, with the class field names slightly different from the
column names but defined in the @QuerySqlField(name="blah") annotation of
the POJO.  However, when it runs, I'm getting an "IgniteException: Failed to
prepare Cassandra CQL statement: insert into ..."
and all of the column names are the lowercase names of the class fields and
not what is defined in the annotation.  What's my problem?
(configuration/class info to follow at the bottom)

Second, none of my "java.sql.Timestamp" fields are showing up in the insert
statement generated in the exception.  My C* columns are defined as
"timestamp" fields, and what I've seen seems to indicate that those should be
"java.sql.Timestamp" datatypes, but it's not working.  How do I get my
Timestamp fields to be recognized?


===
Ignite Exception:

[Mule] 2016-12-16 09:13:21,901  WARN 
-com.datastax.driver.core.ReplicationStrategy$NetworkTopologyStrategy.computeTokenToReplicaMap(ReplicationStategy.java:198)
 
- Error while computing token map for keyspace dev_cirad with datacenter
dc1: could not achieve replication factor 3 (found 0 replicas only), check
your keyspace replication settings.
[09:13:22,324][SEVERE][main][CassandraCacheStore] Failed to execute
Cassandra CQL statement: insert into "mykeyspace"."HistoryResult"
("histname", "sessionid", "vin", "createdby", "modifiedby", "results")
values (?,?,?,?,?,?) using ttl 2592000;
class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL
statement: insert into "mykeyspace"."HistoryResult" ("histname",
"sessionid", "vin", "createdby", "modifiedby", "results") values
(?,?,?,?,?,?) using ttl 2592000;
at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:163)

Caused by: class org.apache.ignite.IgniteException: Failed to prepare
Cassandra CQL statement: insert into "mykeyspace"."HistoryResult"
("histname", "sessionid", "vin", "createdby", "modifiedby", "results")
values (?,?,?,?,?,?) using ttl 2592000;
at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:615)
at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:133)
... 20 more
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException:
Unknown identifier histname
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.AbstractSession.prepare(AbstractSession.java:98)
at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:597)
... 21 more

===
cassandra-ignite.xml

(cassandra-ignite.xml not preserved by the list archive; only the Spring schema
declarations and the discovery address 127.0.0.1:47500..47509 survived)

===
cassandra-connection-settings.xml



(cassandra-connection-settings.xml not preserved by the list archive; the
message is truncated at this point)

Re: Ignite with cassandra

2016-11-10 Thread Dmitriy Govorukhin
Hi, I think this problem is related to the Cassandra configuration or the
network; check your firewall. You can try changing the Cassandra configuration
to enable port 9160. Also check which version of the cassandra-jdbc driver you
are using; different versions use different ports.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-cassandra-tp8777p8881.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: spark SQL thriftserver over ignite and cassandra

2016-10-25 Thread Igor Sapego
>>>> It's a security issue, Ignite cache doesn't provide multiple user
>>>> account per cache. I am thinking of using Spark to authenticate multiple
>>>> users and then Spark use a shared account on Ignite cache
>>>>
>>>>
>>>> Basically, Ignite provides basic security interfaces and some
>>>> implementations which you can rely on by building your secure solution.
>>>> This article can be useful for your case
>>>> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/
>>>>
>>>> —
>>>> Denis
>>>>
>>>>
>>>>> If you need a real multi-tenancy support where cacheA is allowed to be
>>>>> accessed by a group of users A only and cacheB by users from group B then
>>>>> you can take a look at GridGain which is built on top of Ignite
>>>>> https://gridgain.readme.io/docs/multi-tenancy
>>>>>
>>>>>
>>>>>
>>>> OK but I am evaluating open source only solutions (kylin, druid,
>>>> alluxio...), it's a constraint from my hierarchy
>>>>
>>>>>
>>>>> What I want to achieve is :
>>>>> - use Cassandra for data store as it provides idempotence (HDFS/hive
>>>>> doesn't), resulting in exactly once semantic without any duplicates.
>>>>> - use Spark SQL thriftserver in multi tenancy for large scale adhoc
>>>>> analytics queries (> TB) from an ODBC driver through HTTP(S)
>>>>> - accelerate Cassandra reads when the data modeling of the Cassandra
>>>>> table doesn't fit the queries. Queries would be OLAP style: target 
>>>>> multiple
>>>>> C* partitions, groupby or filters on lots of dimensions that aren't
>>>>> necessarely in the C* table key.
>>>>>
>>>>>
>>>>> As it was mentioned Ignite uses Cassandra as a CacheStore. You should
>>>>> keep this in mind. Before trying to assemble all the chain I would
>>>>> recommend you trying to connect Spark SQL thrift server directly to Ignite
>>>>> and work with its shared RDDs [1]. A shared RDD (basically Ignite cache)
>>>>> can be backed by Cassandra. Probably this chain will work for you but I
>>>>> can’t give more precise guidance on this.
>>>>>
>>>>>
>>>> I will try to make it works and give you feedback
>>>>
>>>>
>>>>
>>>>> [1] https://apacheignite-fs.readme.io/docs/ignite-for-spark
>>>>>
>>>>> —
>>>>> Denis
>>>>>
>>>>> Thanks for your advises
>>>>>
>>>>>
>>>>> 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>:
>>>>>
>>>>>> I am not sure that this will be performant. What do you want to
>>>>>> achieve here? Fast lookups? Then the Cassandra Ignite store might be the
>>>>>> right solution. If you want to do more analytic style of queries then you
>>>>>> can put the data on HDFS/Hive and use the Ignite HDFS cache to cache
>>>>>> certain partitions/tables in Hive in-memory. If you want to go to 
>>>>>> iterative
>>>>>> machine learning algorithms you can go for Spark on top of this. You can
>>>>>> use then also Ignite cache for Spark RDDs.
>>>>>>
>>>>>> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi, Vincent!
>>>>>>
>>>>>> Ignite also has SQL support (also scalable), I think it will be much
>>>>>> faster to query directly from Ignite than query from Spark.
>>>>>> Also please mind, that before executing queries you should load all
>>>>>> needed data to cache.
>>>>>> To load data from Cassandra to Ignite you may use Cassandra store [1].
>>>>>>
>>>>>> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra
>>>>>>
>>>>>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski <
>>>>>> vincent.gromakow...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I am evaluating the possibility to use Spark SQL (and its
>>>>>>> scalability) over an Ignite cache with Cassandra persistent store to
>>>>>>> increase read workloads like OLAP style analytics.
>>>>>>> Is there any way to configure Spark thriftserver to load an external
>>>>>>> table in Ignite like we can do in Cassandra ?
>>>>>>> Here is an example of config for spark backed by cassandra
>>>>>>>
>>>>>>> CREATE EXTERNAL TABLE MyHiveTable
>>>>>>> ( id int, data string )
>>>>>>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
>>>>>>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x",
>>>>>>>   "cassandra.ks.name" = "test",
>>>>>>>   "cassandra.cf.name" = "mytable",
>>>>>>>   "cassandra.ks.repfactor" = "1",
>>>>>>>   "cassandra.ks.strategy" = "org.apache.cassandra.locator.SimpleStrategy" );
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Alexey Kuznetsov
>>>>>>
>>>>>>
>>>>
>>
>


Re: spark SQL thriftserver over ignite and cassandra

2016-10-17 Thread vincent gromakowski
Hi
I mean using HTTPS transport instead of binary (thrift?) transport.

2016-10-17 19:10 GMT+02:00 Igor Sapego <isap...@gridgain.com>:

> Hi Vincent,
>
> Can you please explain what do you mean by HTTP(S) support for the ODBC?
>
> I'm not quite sure I get it.
>
> Best Regards,
> Igor
>
> On Thu, Oct 6, 2016 at 9:59 AM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Thanks
>>
>> Starting the thriftserver with igniterdd tables doesn't seem very hard.
>> Implementing a security layer over ignite cache may be harder as I need to:
>> - get username from thriftserver
>> - intercept each request and check permissions
>> Maybe spark will also be able to handle permissions...
>>
>> I will keep you informed
>>
>> On 6 Oct 2016 00:12, "Denis Magda" <dma...@gridgain.com> wrote:
>>
>>> Vincent,
>>>
>>> Please see below
>>>
>>> On Oct 5, 2016, at 4:31 AM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>> Hi
>>> thanks for your explanations. Please find inline more questions
>>>
>>> Vincent
>>>
>>> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>:
>>>
>>>> Hi Vincent,
>>>>
>>>> See my answers inline
>>>>
>>>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski <
>>>> vincent.gromakow...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>> I know that Ignite has SQL support but:
>>>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier
>>>> to integrate on corporate networks with rules, firewalls, proxies
>>>>
>>>>
>>>> *Igor Sapego*, what URIs are supported presently?
>>>>
>>>> - The SQL engine doesn't seem to scale like Spark SQL would. For
>>>> instance, Spark won't generate OOM if the dataset (source or result) doesn't
>>>> fit in memory. From the Ignite side, it's not clear…
>>>>
>>>>
>>>> OOM is not related to scalability topic at all. This is about
>>>> application’s logic.
>>>>
>>>> Ignite SQL engine perfectly scales out along with your cluster.
>>>> Moreover, Ignite supports indexes which allows you to get O(logN) running
>>>> time complexity for your SQL queries while in case of Spark you will face
>>>> with full-scans (O(N)) all the time.
>>>>
>>>> However, to benefit from Ignite SQL queries you have to put all the
>>>> data in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational
>>>> database, MongoDB, etc) while a SQL query is executed and won’t preload
>>>> anything from an underlying CacheStore. Automatic preloading works for
>>>> key-value queries like cache.get(key).
>>>>
>>>
>>>
>>> This is an issue because I will potentially have to query TB of data. If
>>> I use Spark thriftserver backed by IgniteRDD, does it solve this point and
>>> can I get automatic preloading from C* ?
>>>
>>>
>>> IgniteRDD will load missing tuples (key-value pairs) from Cassandra
>>> because essentially IgniteRDD is an IgniteCache and Cassandra is a
>>> CacheStore. The only thing that is left to check is whether the Spark
>>> thriftserver can work with IgniteRDDs. Hope you will be able to figure this out
>>> and share your feedback with us.
>>>
>>>
>>>
>>>> - Spark thrift can manage multi tenancy: different users can connect to
>>>> the same SQL engine and share cache. In Ignite it's one cache per user, so
>>>> a big waste of RAM.
>>>>
>>>>
>>>> Everyone can connect to an Ignite cluster and work with the same set of
>>>> distributed caches. I’m not sure why you need to create caches with the
>>>> same content for every user.
>>>>
>>>
>>> It's a security issue, Ignite cache doesn't provide multiple user
>>> account per cache. I am thinking of using Spark to authenticate multiple
>>> users and then Spark use a shared account on Ignite cache
>>>
>>>
>>> Basically, Ignite provides basic security interfaces and some
>>> implementations which you can rely on by building your secure solution.
>>> This article can be useful for your case
>>> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/
>>>
>>> —
>>> Denis
>>>
>>&g

Re: spark SQL thriftserver over ignite and cassandra

2016-10-17 Thread Igor Sapego
Hi Vincent,

Can you please explain what do you mean by HTTP(S) support for the ODBC?

I'm not quite sure I get it.

Best Regards,
Igor

On Thu, Oct 6, 2016 at 9:59 AM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Thanks
>
> Starting the thriftserver with igniterdd tables doesn't seem very hard.
> Implementing a security layer over ignite cache may be harder as I need to:
> - get username from thriftserver
> - intercept each request and check permissions
> Maybe spark will also be able to handle permissions...
>
> I will keep you informed
>
> On 6 Oct 2016 00:12, "Denis Magda" <dma...@gridgain.com> wrote:
>
>> Vincent,
>>
>> Please see below
>>
>> On Oct 5, 2016, at 4:31 AM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>> Hi
>> thanks for your explanations. Please find inline more questions
>>
>> Vincent
>>
>> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>:
>>
>>> Hi Vincent,
>>>
>>> See my answers inline
>>>
>>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>> Hi,
>>> I know that Ignite has SQL support but:
>>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier
>>> to integrate on corporate networks with rules, firewalls, proxies
>>>
>>>
>>> *Igor Sapego*, what URIs are supported presently?
>>>
>>> - The SQL engine doesn't seem to scale like Spark SQL would. For
>>> instance, Spark won't generate OOM if the dataset (source or result) doesn't
>>> fit in memory. From the Ignite side, it's not clear…
>>>
>>>
>>> OOM is not related to scalability topic at all. This is about
>>> application’s logic.
>>>
>>> Ignite SQL engine perfectly scales out along with your cluster.
>>> Moreover, Ignite supports indexes which allows you to get O(logN) running
>>> time complexity for your SQL queries while in case of Spark you will face
>>> with full-scans (O(N)) all the time.
>>>
>>> However, to benefit from Ignite SQL queries you have to put all the data
>>> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational
>>> database, MongoDB, etc) while a SQL query is executed and won’t preload
>>> anything from an underlying CacheStore. Automatic preloading works for
>>> key-value queries like cache.get(key).
>>>
>>
>>
>> This is an issue because I will potentially have to query TB of data. If
>> I use Spark thriftserver backed by IgniteRDD, does it solve this point and
>> can I get automatic preloading from C* ?
>>
>>
>> IgniteRDD will load missing tuples (key-value pairs) from Cassandra
>> because essentially IgniteRDD is an IgniteCache and Cassandra is a
>> CacheStore. The only thing that is left to check is whether the Spark
>> thriftserver can work with IgniteRDDs. Hope you will be able to figure this out
>> and share your feedback with us.
>>
>>
>>
>>> - Spark thrift can manage multi tenancy: different users can connect to
>>> the same SQL engine and share cache. In Ignite it's one cache per user, so
>>> a big waste of RAM.
>>>
>>>
>>> Everyone can connect to an Ignite cluster and work with the same set of
>>> distributed caches. I’m not sure why you need to create caches with the
>>> same content for every user.
>>>
>>
>> It's a security issue, Ignite cache doesn't provide multiple user account
>> per cache. I am thinking of using Spark to authenticate multiple users and
>> then Spark use a shared account on Ignite cache
>>
>>
>> Basically, Ignite provides basic security interfaces and some
>> implementations which you can rely on by building your secure solution.
>> This article can be useful for your case
>> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/
>>
>> —
>> Denis
>>
>>
>>> If you need a real multi-tenancy support where cacheA is allowed to be
>>> accessed by a group of users A only and cacheB by users from group B then
>>> you can take a look at GridGain which is built on top of Ignite
>>> https://gridgain.readme.io/docs/multi-tenancy
>>>
>>>
>>>
>> OK but I am evaluating open source only solutions (kylin, druid,
>> alluxio...), it's a constraint from my hierarchy
>>
>>>
>>> What I want to achieve is :
>>> - use Cassandra for data store a

Re: spark SQL thriftserver over ignite and cassandra

2016-10-06 Thread vincent gromakowski
Thanks

Starting the thriftserver with igniterdd tables doesn't seem very hard.
Implementing a security layer over ignite cache may be harder as I need to:
- get username from thriftserver
- intercept each request and check permissions
Maybe spark will also be able to handle permissions...

I will keep you informed

On 6 Oct 2016 00:12, "Denis Magda" <dma...@gridgain.com> wrote:

> Vincent,
>
> Please see below
>
> On Oct 5, 2016, at 4:31 AM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
> Hi
> thanks for your explanations. Please find inline more questions
>
> Vincent
>
> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>:
>
>> Hi Vincent,
>>
>> See my answers inline
>>
>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>> Hi,
>> I know that Ignite has SQL support but:
>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to
>> integrate on corporate networks with rules, firewalls, proxies
>>
>>
>> *Igor Sapego*, what URIs are supported presently?
>>
>> - The SQL engine doesn't seem to scale like Spark SQL would. For
>> instance, Spark won't generate OOM if the dataset (source or result) doesn't
>> fit in memory. From the Ignite side, it's not clear…
>>
>>
>> OOM is not related to scalability topic at all. This is about
>> application’s logic.
>>
>> Ignite SQL engine perfectly scales out along with your cluster. Moreover,
>> Ignite supports indexes which allows you to get O(logN) running time
>> complexity for your SQL queries while in case of Spark you will face with
>> full-scans (O(N)) all the time.
>>
>> However, to benefit from Ignite SQL queries you have to put all the data
>> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational
>> database, MongoDB, etc) while a SQL query is executed and won’t preload
>> anything from an underlying CacheStore. Automatic preloading works for
>> key-value queries like cache.get(key).
>>
>
>
> This is an issue because I will potentially have to query TB of data. If I
> use Spark thriftserver backed by IgniteRDD, does it solve this point and
> can I get automatic preloading from C* ?
>
>
> IgniteRDD will load missing tuples (key-value pairs) from Cassandra because
> essentially IgniteRDD is an IgniteCache and Cassandra is a CacheStore. The
> only thing that is left to check is whether the Spark thriftserver can work with
> IgniteRDDs. Hope you will be able to figure this out and share your feedback
> with us.
>
>
>
>> - Spark thrift can manage multi tenancy: different users can connect to
>> the same SQL engine and share cache. In Ignite it's one cache per user, so
>> a big waste of RAM.
>>
>>
>> Everyone can connect to an Ignite cluster and work with the same set of
>> distributed caches. I’m not sure why you need to create caches with the
>> same content for every user.
>>
>
> It's a security issue, Ignite cache doesn't provide multiple user account
> per cache. I am thinking of using Spark to authenticate multiple users and
> then Spark use a shared account on Ignite cache
>
>
> Basically, Ignite provides basic security interfaces and some
> implementations which you can rely on by building your secure solution.
> This article can be useful for your case
> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/
>
> —
> Denis
>
>
>> If you need a real multi-tenancy support where cacheA is allowed to be
>> accessed by a group of users A only and cacheB by users from group B then
>> you can take a look at GridGain which is built on top of Ignite
>> https://gridgain.readme.io/docs/multi-tenancy
>>
>>
>>
> OK but I am evaluating open source only solutions (kylin, druid,
> alluxio...), it's a constraint from my hierarchy
>
>>
>> What I want to achieve is :
>> - use Cassandra for data store as it provides idempotence (HDFS/hive
>> doesn't), resulting in exactly once semantic without any duplicates.
>> - use Spark SQL thriftserver in multi tenancy for large scale adhoc
>> analytics queries (> TB) from an ODBC driver through HTTP(S)
>> - accelerate Cassandra reads when the data modeling of the Cassandra
>> table doesn't fit the queries. Queries would be OLAP style: target multiple
>> C* partitions, groupby or filters on lots of dimensions that aren't
>> necessarely in the C* table key.
>>
>>
>> As it was mentioned Ignite uses Cassandra as a CacheStore. You should
>> keep this in mind. Before trying to assemble

Re: spark SQL thriftserver over ignite and cassandra

2016-10-05 Thread Denis Magda
Vincent,

Please see below

> On Oct 5, 2016, at 4:31 AM, vincent gromakowski 
> <vincent.gromakow...@gmail.com> wrote:
> 
> Hi
> thanks for your explanations. Please find inline more questions 
> 
> Vincent
> 
> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com 
> <mailto:dma...@gridgain.com>>:
> Hi Vincent,
> 
> See my answers inline
> 
>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski 
>> <vincent.gromakow...@gmail.com <mailto:vincent.gromakow...@gmail.com>> wrote:
>> 
>> Hi,
>> I know that Ignite has SQL support but:
>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to 
>> integrate on corporate networks with rules, firewalls, proxies
> 
> Igor Sapego, what URIs are supported presently? 
> 
>> - The SQL engine doesn't seem to scale like Spark SQL would. For instance, 
>> Spark won't generate OOM if the dataset (source or result) doesn't fit in 
>> memory. From the Ignite side, it's not clear…
> 
> OOM is not related to scalability topic at all. This is about application’s 
> logic. 
> 
> Ignite SQL engine perfectly scales out along with your cluster. Moreover, 
> Ignite supports indexes which allows you to get O(logN) running time 
> complexity for your SQL queries while in case of Spark you will face with 
> full-scans (O(N)) all the time.
> 
> However, to benefit from Ignite SQL queries you have to put all the data 
> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational database, 
> MongoDB, etc) while a SQL query is executed and won’t preload anything from 
> an underlying CacheStore. Automatic preloading works for key-value queries 
> like cache.get(key).
> 
> 
> This is an issue because I will potentially have to query TB of data. If I 
> use Spark thriftserver backed by IgniteRDD, does it solve this point and can 
> I get automatic preloading from C* ?

IgniteRDD will load missing tuples (key-value pairs) from Cassandra because 
essentially IgniteRDD is an IgniteCache and Cassandra is a CacheStore. The only 
thing that is left to check is whether the Spark thriftserver can work with 
IgniteRDDs. Hope you will be able to figure this out and share your feedback with 
us.


> 
>> - Spark thrift can manage multi tenancy: different users can connect to the 
>> same SQL engine and share cache. In Ignite it's one cache per user, so a big 
>> waste of RAM.
> 
> Everyone can connect to an Ignite cluster and work with the same set of 
> distributed caches. I’m not sure why you need to create caches with the same 
> content for every user.
> 
> It's a security issue, Ignite cache doesn't provide multiple user account per 
> cache. I am thinking of using Spark to authenticate multiple users and then 
> Spark use a shared account on Ignite cache
>  
Basically, Ignite provides basic security interfaces and some implementations 
which you can rely on by building your secure solution. This article can be 
useful for your case
http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ 
<http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/>

—
Denis

> 
> If you need a real multi-tenancy support where cacheA is allowed to be 
> accessed by a group of users A only and cacheB by users from group B then you 
> can take a look at GridGain which is built on top of Ignite
> https://gridgain.readme.io/docs/multi-tenancy 
> <https://gridgain.readme.io/docs/multi-tenancy>
> 
> 
> 
> OK but I am evaluating open source only solutions (kylin, druid, alluxio...), 
> it's a constraint from my hierarchy
>> 
>> What I want to achieve is :
>> - use Cassandra for data store as it provides idempotence (HDFS/hive 
>> doesn't), resulting in exactly once semantic without any duplicates. 
>> - use Spark SQL thriftserver in multi tenancy for large scale adhoc 
>> analytics queries (> TB) from an ODBC driver through HTTP(S) 
>> - accelerate Cassandra reads when the data modeling of the Cassandra table 
>> doesn't fit the queries. Queries would be OLAP style: target multiple C* 
>> partitions, groupby or filters on lots of dimensions that aren't necessarely 
>> in the C* table key.
>> 
> 
> As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep 
> this in mind. Before trying to assemble all the chain I would recommend you 
> trying to connect Spark SQL thrift server directly to Ignite and work with 
> its shared RDDs [1]. A shared RDD (basically Ignite cache) can be backed by 
> Cassandra. Probably this chain will work for you but I can’t give more 
> precise guidance on this.
> 
> 
> I will try to make it works and give you feedback
> 
>  
> [1] https://apacheignite

Re: spark SQL thriftserver over ignite and cassandra

2016-10-05 Thread vincent gromakowski
Hi
thanks for your explanations. Please find inline more questions

Vincent

2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>:

> Hi Vincent,
>
> See my answers inline
>
> On Oct 4, 2016, at 12:54 AM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
> Hi,
> I know that Ignite has SQL support but:
> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to
> integrate on corporate networks with rules, firewalls, proxies
>
>
> *Igor Sapego*, what URIs are supported presently?
>
> - The SQL engine doesn't seem to scale like Spark SQL would. For instance,
> Spark won't generate OOM is dataset (source or result) doesn't fit in
> memory. From Ignite side, it's not clear…
>
>
> OOM is not related to scalability topic at all. This is about
> application’s logic.
>
> Ignite SQL engine perfectly scales out along with your cluster. Moreover,
> Ignite supports indexes which allows you to get O(logN) running time
> complexity for your SQL queries while in case of Spark you will face with
> full-scans (O(N)) all the time.
>
> However, to benefit from Ignite SQL queries you have to put all the data
> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational
> database, MongoDB, etc) while a SQL query is executed and won’t preload
> anything from an underlying CacheStore. Automatic preloading works for
> key-value queries like cache.get(key).
>


This is an issue because I will potentially have to query TB of data. If I
use Spark thriftserver backed by IgniteRDD, does it solve this point and
can I get automatic preloading from C* ?

>
> - Spark thrift can manage multi tenancy: different users can connect to
> the same SQL engine and share cache. In Ignite it's one cache per user, so
> a big waste of RAM.
>
>
> Everyone can connect to an Ignite cluster and work with the same set of
> distributed caches. I’m not sure why you need to create caches with the
> same content for every user.
>

It's a security issue, Ignite cache doesn't provide multiple user account
per cache. I am thinking of using Spark to authenticate multiple users and
then Spark use a shared account on Ignite cache


>
> If you need a real multi-tenancy support where cacheA is allowed to be
> accessed by a group of users A only and cacheB by users from group B then
> you can take a look at GridGain which is built on top of Ignite
> https://gridgain.readme.io/docs/multi-tenancy
>
>
>
OK but I am evaluating open source only solutions (kylin, druid,
alluxio...), it's a constraint from my hierarchy

>
> What I want to achieve is :
> - use Cassandra for data store as it provides idempotence (HDFS/hive
> doesn't), resulting in exactly once semantic without any duplicates.
> - use Spark SQL thriftserver in multi tenancy for large scale adhoc
> analytics queries (> TB) from an ODBC driver through HTTP(S)
> - accelerate Cassandra reads when the data modeling of the Cassandra table
> doesn't fit the queries. Queries would be OLAP style: target multiple C*
> partitions, groupby or filters on lots of dimensions that aren't
> necessarely in the C* table key.
>
>
> As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep
> this in mind. Before trying to assemble all the chain I would recommend you
> trying to connect Spark SQL thrift server directly to Ignite and work with
> its shared RDDs [1]. A shared RDD (basically Ignite cache) can be backed by
> Cassandra. Probably this chain will work for you but I can’t give more
> precise guidance on this.
>
>
I will try to make it works and give you feedback



> [1] https://apacheignite-fs.readme.io/docs/ignite-for-spark
>
> —
> Denis
>
> Thanks for your advises
>
>
> 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>:
>
>> I am not sure that this will be performant. What do you want to achieve
>> here? Fast lookups? Then the Cassandra Ignite store might be the right
>> solution. If you want to do more analytic style of queries then you can put
>> the data on HDFS/Hive and use the Ignite HDFS cache to cache certain
>> partitions/tables in Hive in-memory. If you want to go to iterative machine
>> learning algorithms you can go for Spark on top of this. You can use then
>> also Ignite cache for Spark RDDs.
>>
>> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com>
>> wrote:
>>
>> Hi, Vincent!
>>
>> Ignite also has SQL support (also scalable), I think it will be much
>> faster to query directly from Ignite than query from Spark.
>> Also please mind, that before executing queries you should load all
>> needed data to cache.
>> To load data from Cassand

Re: spark SQL thriftserver over ignite and cassandra

2016-10-04 Thread Denis Magda
Hi Vincent,

See my answers inline

> On Oct 4, 2016, at 12:54 AM, vincent gromakowski 
> <vincent.gromakow...@gmail.com> wrote:
> 
> Hi,
> I know that Ignite has SQL support but:
> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to 
> integrate on corporate networks with rules, firewalls, proxies

Igor Sapego, what URIs are supported presently? 

> - The SQL engine doesn't seem to scale like Spark SQL would. For instance, 
> Spark won't generate OOM is dataset (source or result) doesn't fit in memory. 
> From Ignite side, it's not clear…

OOM is not related to scalability topic at all. This is about application’s 
logic. 

Ignite SQL engine perfectly scales out along with your cluster. Moreover, 
Ignite supports indexes which allows you to get O(logN) running time complexity 
for your SQL queries while in case of Spark you will face with full-scans 
(O(N)) all the time.

However, to benefit from Ignite SQL queries you have to put all the data 
in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational database, 
MongoDB, etc) while a SQL query is executed and won’t preload anything from an 
underlying CacheStore. Automatic preloading works for key-value queries like 
cache.get(key).

> - Spark thrift can manage multi tenancy: different users can connect to the 
> same SQL engine and share cache. In Ignite it's one cache per user, so a big 
> waste of RAM.

Everyone can connect to an Ignite cluster and work with the same set of 
distributed caches. I’m not sure why you need to create caches with the same 
content for every user.

If you need a real multi-tenancy support where cacheA is allowed to be accessed 
by a group of users A only and cacheB by users from group B then you can take a 
look at GridGain which is built on top of Ignite
https://gridgain.readme.io/docs/multi-tenancy


> 
> What I want to achieve is :
> - use Cassandra for data store as it provides idempotence (HDFS/hive 
> doesn't), resulting in exactly once semantic without any duplicates. 
> - use Spark SQL thriftserver in multi tenancy for large scale adhoc analytics 
> queries (> TB) from an ODBC driver through HTTP(S) 
> - accelerate Cassandra reads when the data modeling of the Cassandra table 
> doesn't fit the queries. Queries would be OLAP style: target multiple C* 
> partitions, groupby or filters on lots of dimensions that aren't necessarely 
> in the C* table key.
> 

As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep this 
in mind. Before trying to assemble all the chain I would recommend you trying 
to connect Spark SQL thrift server directly to Ignite and work with its shared 
RDDs [1]. A shared RDD (basically Ignite cache) can be backed by Cassandra. 
Probably this chain will work for you but I can’t give more precise guidance on 
this.

[1] https://apacheignite-fs.readme.io/docs/ignite-for-spark
 
—
Denis

> Thanks for your advises
> 
> 
> 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com 
> <mailto:jornfra...@gmail.com>>:
> I am not sure that this will be performant. What do you want to achieve here? 
> Fast lookups? Then the Cassandra Ignite store might be the right solution. If 
> you want to do more analytic style of queries then you can put the data on 
> HDFS/Hive and use the Ignite HDFS cache to cache certain partitions/tables in 
> Hive in-memory. If you want to go to iterative machine learning algorithms 
> you can go for Spark on top of this. You can use then also Ignite cache for 
> Spark RDDs.
> 
> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com 
> <mailto:akuznet...@gridgain.com>> wrote:
> 
>> Hi, Vincent!
>> 
>> Ignite also has SQL support (also scalable), I think it will be much faster 
>> to query directly from Ignite than query from Spark.
>> Also please mind, that before executing queries you should load all needed 
>> data to cache.
>> To load data from Cassandra to Ignite you may use Cassandra store [1].
>> 
>> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra 
>> <https://apacheignite.readme.io/docs/ignite-with-apache-cassandra>
>> 
>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski 
>> <vincent.gromakow...@gmail.com <mailto:vincent.gromakow...@gmail.com>> wrote:
>> Hi,
>> I am evaluating the possibility to use Spark SQL (and its scalability) over 
>> an Ignite cache with Cassandra persistent store to increase read workloads 
>> like OLAP style analytics.
>> Is there any way to configure Spark thriftserver to load an external table 
>> in Ignite like we can do in Cassandra ?
>> Here is an example of config for spark backed by cassandra
>> 
>> CREATE EXTERNAL TABLE MyHiveTable 
>

Re: spark SQL thriftserver over ignite and cassandra

2016-10-04 Thread vincent gromakowski
Do you have any remark/correction on my  assumptions ?

Le 4 oct. 2016 9:54 AM, "vincent gromakowski" <vincent.gromakow...@gmail.com>
a écrit :

> Hi,
> I know that Ignite has SQL support but:
> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to
> integrate on corporate networks with rules, firewalls, proxies
> - The SQL engine doesn't seem to scale like Spark SQL would. For instance,
> Spark won't generate OOM is dataset (source or result) doesn't fit in
> memory. From Ignite side, it's not clear...
> - Spark thrift can manage multi tenancy: different users can connect to
> the same SQL engine and share cache. In Ignite it's one cache per user, so
> a big waste of RAM.
>
> What I want to achieve is :
> - use Cassandra for data store as it provides idempotence (HDFS/hive
> doesn't), resulting in exactly once semantic without any duplicates.
> - use Spark SQL thriftserver in multi tenancy for large scale adhoc
> analytics queries (> TB) from an ODBC driver through HTTP(S)
> - accelerate Cassandra reads when the data modeling of the Cassandra table
> doesn't fit the queries. Queries would be OLAP style: target multiple C*
> partitions, groupby or filters on lots of dimensions that aren't
> necessarely in the C* table key.
>
> Thanks for your advises
>
>
> 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>:
>
>> I am not sure that this will be performant. What do you want to achieve
>> here? Fast lookups? Then the Cassandra Ignite store might be the right
>> solution. If you want to do more analytic style of queries then you can put
>> the data on HDFS/Hive and use the Ignite HDFS cache to cache certain
>> partitions/tables in Hive in-memory. If you want to go to iterative machine
>> learning algorithms you can go for Spark on top of this. You can use then
>> also Ignite cache for Spark RDDs.
>>
>> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com>
>> wrote:
>>
>> Hi, Vincent!
>>
>> Ignite also has SQL support (also scalable), I think it will be much
>> faster to query directly from Ignite than query from Spark.
>> Also please mind, that before executing queries you should load all
>> needed data to cache.
>> To load data from Cassandra to Ignite you may use Cassandra store [1].
>>
>> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra
>>
>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski <
>> vincent.gromakow...@gmail.com> wrote:
>>
>>> Hi,
>>> I am evaluating the possibility to use Spark SQL (and its scalability)
>>> over an Ignite cache with Cassandra persistent store to increase read
>>> workloads like OLAP style analytics.
>>> Is there any way to configure Spark thriftserver to load an external
>>> table in Ignite like we can do in Cassandra ?
>>> Here is an example of config for spark backed by cassandra
>>>
>>> CREATE EXTERNAL TABLE MyHiveTable
>>> ( id int, data string )
>>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
>>>
>>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name"
>>> = "test" ,
>>>   "cassandra.cf.name" = "mytable" ,
>>>   "cassandra.ks.repfactor" = "1" ,
>>>   "cassandra.ks.strategy" =
>>> "org.apache.cassandra.locator.SimpleStrategy" );
>>>
>>>
>>
>>
>> --
>> Alexey Kuznetsov
>>
>>
>


Re: spark SQL thriftserver over ignite and cassandra

2016-10-04 Thread vincent gromakowski
Hi,
I know that Ignite has SQL support but:
- ODBC driver doesn't seem to provide HTTP(S) support, which is easier to
integrate on corporate networks with rules, firewalls, proxies
- The SQL engine doesn't seem to scale like Spark SQL would. For instance,
Spark won't generate OOM is dataset (source or result) doesn't fit in
memory. From Ignite side, it's not clear...
- Spark thrift can manage multi tenancy: different users can connect to the
same SQL engine and share cache. In Ignite it's one cache per user, so a
big waste of RAM.

What I want to achieve is :
- use Cassandra for data store as it provides idempotence (HDFS/hive
doesn't), resulting in exactly once semantic without any duplicates.
- use Spark SQL thriftserver in multi tenancy for large scale adhoc
analytics queries (> TB) from an ODBC driver through HTTP(S)
- accelerate Cassandra reads when the data modeling of the Cassandra table
doesn't fit the queries. Queries would be OLAP style: target multiple C*
partitions, groupby or filters on lots of dimensions that aren't
necessarely in the C* table key.

Thanks for your advises


2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>:

> I am not sure that this will be performant. What do you want to achieve
> here? Fast lookups? Then the Cassandra Ignite store might be the right
> solution. If you want to do more analytic style of queries then you can put
> the data on HDFS/Hive and use the Ignite HDFS cache to cache certain
> partitions/tables in Hive in-memory. If you want to go to iterative machine
> learning algorithms you can go for Spark on top of this. You can use then
> also Ignite cache for Spark RDDs.
>
> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> wrote:
>
> Hi, Vincent!
>
> Ignite also has SQL support (also scalable), I think it will be much
> faster to query directly from Ignite than query from Spark.
> Also please mind, that before executing queries you should load all needed
> data to cache.
> To load data from Cassandra to Ignite you may use Cassandra store [1].
>
> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra
>
> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Hi,
>> I am evaluating the possibility to use Spark SQL (and its scalability)
>> over an Ignite cache with Cassandra persistent store to increase read
>> workloads like OLAP style analytics.
>> Is there any way to configure Spark thriftserver to load an external
>> table in Ignite like we can do in Cassandra ?
>> Here is an example of config for spark backed by cassandra
>>
>> CREATE EXTERNAL TABLE MyHiveTable
>> ( id int, data string )
>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
>>
>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name"
>> = "test" ,
>>   "cassandra.cf.name" = "mytable" ,
>>   "cassandra.ks.repfactor" = "1" ,
>>   "cassandra.ks.strategy" =
>> "org.apache.cassandra.locator.SimpleStrategy" );
>>
>>
>
>
> --
> Alexey Kuznetsov
>
>


Re: spark SQL thriftserver over ignite and cassandra

2016-10-03 Thread Jörn Franke
I am not sure that this will be performant. What do you want to achieve here? 
Fast lookups? Then the Cassandra Ignite store might be the right solution. If 
you want to do more analytic style of queries then you can put the data on 
HDFS/Hive and use the Ignite HDFS cache to cache certain partitions/tables in 
Hive in-memory. If you want to go to iterative machine learning algorithms you 
can go for Spark on top of this. You can use then also Ignite cache for Spark 
RDDs.

> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> wrote:
> 
> Hi, Vincent!
> 
> Ignite also has SQL support (also scalable), I think it will be much faster 
> to query directly from Ignite than query from Spark.
> Also please mind, that before executing queries you should load all needed 
> data to cache.
> To load data from Cassandra to Ignite you may use Cassandra store [1].
> 
> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra
> 
>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski 
>> <vincent.gromakow...@gmail.com> wrote:
>> Hi,
>> I am evaluating the possibility to use Spark SQL (and its scalability) over 
>> an Ignite cache with Cassandra persistent store to increase read workloads 
>> like OLAP style analytics.
>> Is there any way to configure Spark thriftserver to load an external table 
>> in Ignite like we can do in Cassandra ?
>> Here is an example of config for spark backed by cassandra
>> 
>> CREATE EXTERNAL TABLE MyHiveTable 
>> ( id int, data string ) 
>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' 
>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" = 
>> "test" , 
>>   "cassandra.cf.name" = "mytable" , 
>>   "cassandra.ks.repfactor" = "1" , 
>>   "cassandra.ks.strategy" = 
>> "org.apache.cassandra.locator.SimpleStrategy" ); 
>> 
> 
> 
> 
> -- 
> Alexey Kuznetsov


Re: spark SQL thriftserver over ignite and cassandra

2016-10-03 Thread Alexey Kuznetsov
Hi, Vincent!

Ignite also has SQL support (also scalable), I think it will be much faster
to query directly from Ignite than query from Spark.
Also please mind, that before executing queries you should load all needed
data to cache.
To load data from Cassandra to Ignite you may use Cassandra store [1].

[1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra

On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Hi,
> I am evaluating the possibility to use Spark SQL (and its scalability)
> over an Ignite cache with Cassandra persistent store to increase read
> workloads like OLAP style analytics.
> Is there any way to configure Spark thriftserver to load an external table
> in Ignite like we can do in Cassandra ?
> Here is an example of config for spark backed by cassandra
>
> CREATE EXTERNAL TABLE MyHiveTable
> ( id int, data string )
> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
>
> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name"
> = "test" ,
>   "cassandra.cf.name" = "mytable" ,
>   "cassandra.ks.repfactor" = "1" ,
>   "cassandra.ks.strategy" =
> "org.apache.cassandra.locator.SimpleStrategy" );
>
>


-- 
Alexey Kuznetsov


spark SQL thriftserver over ignite and cassandra

2016-10-03 Thread vincent gromakowski
Hi,
I am evaluating the possibility to use Spark SQL (and its scalability) over
an Ignite cache with Cassandra persistent store to increase read workloads
like OLAP style analytics.
Is there any way to configure Spark thriftserver to load an external table
in Ignite like we can do in Cassandra ?
Here is an example of config for spark backed by cassandra

CREATE EXTERNAL TABLE MyHiveTable
( id int, data string )
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler'
TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" =
"test" ,
  "cassandra.cf.name" = "mytable" ,
  "cassandra.ks.repfactor" = "1" ,
  "cassandra.ks.strategy" =
"org.apache.cassandra.locator.SimpleStrategy" );


Re: Ignite with Cassandra

2016-08-29 Thread Igor Rudyak
The reason is in *"id"* field. According to the persistence descriptor, cache
key will be stored in "id" field, but at the same time User POJO class also
has such field. There are several options to fix this:

1) Specify another column mapping for Ignite cache key. For example:
**

2) Specify non default column mapping for "id" field in User class. Here are
the options to do this:

   a) Mark "id" field by *@QuerySqlField* annotation and specify name which
is differ than "id". For example:  

   *@QuerySqlField(name="userId")*

   b) Manually specify columns mapping for User class in xml persistence
descriptor and make sure that "id" field is mapped to something differ that
"id". For example:
*


*

3) Manually specify columns mapping for User class in xml persistence
descriptor and omit "id" field - such a way "id" field from User class
simply will not be persisted into Cassandra table. Which makes sense if you
already have absolutely the same value for Ignite cache key - you don't need
to save the same value twice into two different columns. Example:   
*
    
*



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7395.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra

2016-08-25 Thread Igor Rudyak
Could you please provide full definition of key and value classes and xml
persistence descriptor you are using for Ignite cache?



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7331.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra

2016-08-25 Thread vkulichenko
Hi,

Please properly subscribe to the mailing list so that the community can
receive email notifications for your messages. To subscribe, send empty
email to user-subscr...@ignite.apache.org and follow simple instructions in
the reply.


yvladimirov wrote
> Hello guys!
> Can you help me?
> I want used Apache Ignite with Cassandra.
> 
> My Pojo without Ignite.
> 
> @Table(name = "user")
> public class User {
> @PartitionKey
> private UUID id;
> private String name;
> 
> }
> 
> It's work
> 
> But if I used Ignite
>   Caused by: class org.apache.ignite.IgniteException: Failed to 
> prepare
> Cassandra CQL statement: insert into test.user (id, name, id) values
> (?,?,?);
>   at
> org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:515)
>   at
> org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:132)
>   ... 25 more
> 
> 
> How to fix duplication ID?

What do you use as a key?

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7330.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra and SSL

2016-06-28 Thread Denis Magda
Hi,

This is a duplicate discussion of the following
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-td5610.html#a5700

You will find a solution in the discussion above.

--
Denis



Good morning

Could you please help me understand how to establish persistence to
Cassandrea via SSL?

What else do I need to ensure apart from setting the below flag to true
useSSL  false   Enables the use of SSL




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611p5952.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Ignite with Cassandra and SSL

2016-06-24 Thread Denis Magda
Hi,

First of all you need to create an SSLContext object and use it to initialize 
SSLOptions. Here is a good example [1] provided by Cassandra community.

After that pass this SSLOptions object into Ignite's Cassandra DataSource 
object using 
DataSource.setSslOptions method.

[1] 
https://github.com/datastax/java-driver/blob/2.1.6/driver-core/src/test/java/com/datastax/driver/core/SSLTestBase.java#L84-L114
 
<https://github.com/datastax/java-driver/blob/2.1.6/driver-core/src/test/java/com/datastax/driver/core/SSLTestBase.java#L84-L114>

—
Denis

> On Jun 23, 2016, at 6:47 PM, ChickyDutt <ash.dutt.sha...@gmail.com> wrote:
> 
> Attached is my connection-setting.xml file. I have enabled the useSSL 
> property to true. Can you show me the way to refer the Cassandra node 
> keystore file and the password through SSLOptions and then include it in the 
> attached file. 
> 
> Please let me know if you need any further information.
> 
> Regards.
> Chicky
> 
> On Thu, Jun 23, 2016 at 4:34 PM, Ashish Dutt Sharma <[hidden email] 
> > wrote:
> A gentle reminder. Could you please help me out on this?
> 
> How do I pass the Keystore and the password in SSLOptions in 
> Cassandra.DataSource?
> 
> Regards,
> Ashish Sharma
> 
> On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] 
> <[hidden email] > 
> wrote:
> Igor R., Alexey K. or Val,
> 
> Is SSL presently supported for Cassandra cache store?
> 
> —
> Denis
> 
>> On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email] 
>> <http://user/SendEmail.jtp?type=node=5700=0>> wrote:
>> 
>> Good morning
>> 
>> Could you please help me understand how to establish persistence to 
>> Cassandrea via SSL?
>> 
>> What else do I need to ensure apart from setting the below flag to true
>> useSSL   false   Enables the use of SSL
>> 
>> View this message in context: Ignite with Cassandra and SSL 
>> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html>
>> Sent from the Apache Ignite Users mailing list archive 
>> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com 
>> <http://nabble.com/>.
> 
> 
> 
> If you reply to this email, your message will be added to the discussion 
> below:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html
>  
> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html>
> To start a new topic under Apache Ignite Users, email [hidden email] 
>  
> To unsubscribe from Apache Ignite Users, click here 
> .
> NAML 
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
> 
> 
>  connection-settings.xml (2K) Download Attachment 
> <http://apache-ignite-users.70518.x6.nabble.com/attachment/5847/0/connection-settings.xml>
> View this message in context: Re: Ignite with Cassandra and SSL 
> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5847.html>
> Sent from the Apache Ignite Users mailing list archive 
> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com.



Re: Ignite with Cassandra and SSL

2016-06-23 Thread ChickyDutt
Attached is my connection-setting.xml file. I have enabled the useSSL
property to true. Can you show me the way to refer the Cassandra node
keystore file and the password through SSLOptions and then include it in
the attached file.

Please let me know if you need any further information.

Regards.
Chicky

On Thu, Jun 23, 2016 at 4:34 PM, Ashish Dutt Sharma <
ash.dutt.sha...@gmail.com> wrote:

> A gentle reminder. Could you please help me out on this?
>
> How do I pass the Keystore and the password in SSLOptions in
> Cassandra.DataSource?
>
> Regards,
> Ashish Sharma
>
> On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] <
> ml-node+s70518n5700...@n6.nabble.com> wrote:
>
>> Igor R., Alexey K. or Val,
>>
>> Is SSL presently supported for Cassandra cache store?
>>
>> —
>> Denis
>>
>> On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email]
>> <http:///user/SendEmail.jtp?type=node=5700=0>> wrote:
>>
>> Good morning
>>
>> Could you please help me understand how to establish persistence to
>> Cassandrea via SSL?
>>
>> What else do I need to ensure apart from setting the below flag to true
>> *useSSL* false Enables the use of SSL
>>
>> --
>> View this message in context: Ignite with Cassandra and SSL
>> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html>
>> Sent from the Apache Ignite Users mailing list archive
>> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com
>> <http://nabble.com>.
>>
>>
>>
>>
>> --
>> If you reply to this email, your message will be added to the discussion
>> below:
>>
>> http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html
>> To start a new topic under Apache Ignite Users, email
>> ml-node+s70518n...@n6.nabble.com
>> To unsubscribe from Apache Ignite Users, click here
>> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YXNoLmR1dHQuc2hhcm1hQGdtYWlsLmNvbXwxfC0xOTcwMDkyNjky>
>> .
>> NAML
>> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>
>


connection-settings.xml (2K) 
<http://apache-ignite-users.70518.x6.nabble.com/attachment/5847/0/connection-settings.xml>




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5847.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Ignite with Cassandra and SSL

2016-06-23 Thread ChickyDutt
A gentle reminder. Could you please help me out on this?

How do I pass the Keystore and the password in SSLOptions in
Cassandra.DataSource?

Regards,
Ashish Sharma

On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] <
ml-node+s70518n5700...@n6.nabble.com> wrote:

> Igor R., Alexey K. or Val,
>
> Is SSL presently supported for Cassandra cache store?
>
> —
> Denis
>
> On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email]
> <http:///user/SendEmail.jtp?type=node=5700=0>> wrote:
>
> Good morning
>
> Could you please help me understand how to establish persistence to
> Cassandrea via SSL?
>
> What else do I need to ensure apart from setting the below flag to true
> *useSSL* false Enables the use of SSL
>
> --
> View this message in context: Ignite with Cassandra and SSL
> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html>
> Sent from the Apache Ignite Users mailing list archive
> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com
> <http://nabble.com>.
>
>
>
>
> --
> If you reply to this email, your message will be added to the discussion
> below:
>
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html
> To start a new topic under Apache Ignite Users, email
> ml-node+s70518n...@n6.nabble.com
> To unsubscribe from Apache Ignite Users, click here
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YXNoLmR1dHQuc2hhcm1hQGdtYWlsLmNvbXwxfC0xOTcwMDkyNjky>
> .
> NAML
> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=macro_viewer=instant_html%21nabble%3Aemail.naml=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5843.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Ignite with Cassandra and SSL

2016-06-16 Thread Denis Magda
Igor R., Alexey K. or Val,

Is SSL presently supported for Cassandra cache store?

—
Denis

> On Jun 13, 2016, at 11:34 AM, ChickyDutt <ash.dutt.sha...@gmail.com> wrote:
> 
> Good morning
> 
> Could you please help me understand how to establish persistence to 
> Cassandrea via SSL?
> 
> What else do I need to ensure apart from setting the below flag to true
> useSSLfalse   Enables the use of SSL
> 
> View this message in context: Ignite with Cassandra and SSL 
> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html>
> Sent from the Apache Ignite Users mailing list archive 
> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com.



Ignite with Cassandra and SSL

2016-06-13 Thread ChickyDutt
Good morning

Could you please help me understand how to establish persistence to
Cassandrea via SSL?

What else do I need to ensure apart from setting the below flag to true
*useSSL* false Enables the use of SSL




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.