Re: Issues running Ignite with Cassandra and spark.
I fixed the issue with the DataFrame API and am getting all columns now. However, I am not able to perform grouping + UDAF operations, as Spark tries to push these down to Ignite. Setting OPTION_DISABLE_SPARK_SQL_OPTIMIZATION = true is not helping. How do we tell Ignite to just fetch the data and perform all other operations in Spark?
Re: Issues running Ignite with Cassandra and spark.
Hi,

Thanks for the answer. Unfortunately, we cannot remove Cassandra, as it is being used elsewhere as well. We will have to write directly to Ignite and sync with Cassandra. We had a few other issues while getting data from Spark:

1) cacheRdd.sql("select * from table") gives me heap memory (GC) issues. However, getting the same data using spark.read.format() works fine. Why is this so?

2) In my persistence settings, I have IndexedTypes with key and value POJO classes. The key class corresponds to the key in Cassandra, with partition and clustering keys defined. When querying with SQL (select * from value_class) I get all the columns of the table. However, when querying using spark.read.format(...).option(OPTION_TABLE, value_class).load(), I only get the columns stored in the value class. How do I fetch all the columns using the DataFrame API?

Thanks,
Shrey

On Fri, 28 Sep 2018, 08:43 Alexey Kuznetsov, wrote:
> Hi, Shrey!
>
> Just as an idea - Ignite now has persistence (see
> https://apacheignite.readme.io/docs/distributed-persistent-store),
> maybe you can completely replace Cassandra with Ignite?
>
> In this case the data will always be up to date, with no need to sync with an external db.
>
> --
> Alexey Kuznetsov
Re: Issues running Ignite with Cassandra and spark.
Hi, Shrey!

Just as an idea - Ignite now has persistence (see https://apacheignite.readme.io/docs/distributed-persistent-store), maybe you can completely replace Cassandra with Ignite? In this case the data will always be up to date, with no need to sync with an external db.

--
Alexey Kuznetsov
Re: Issues running Ignite with Cassandra and spark.
Hello!

1) There is no generic way of pulling updates from a 3rd party database, and there is usually no API support for it, so it's not obvious how we could implement that even if we wanted to.

2) By default the cache store will process data in parallel on all nodes. However, it will not align data distribution with that of Cassandra, and I would say that implementing such alignment would be infeasible. Still, you could try to see if there are ways to speed up loadCache by tuning the Ignite and/or cache configurations.

Regards,

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Issues running Ignite with Cassandra and spark.
Hi,

We are using Ignite as a cache layer over Cassandra for faster read queries using Spark. Our cluster has 10 nodes, each running an instance of Cassandra and Ignite. However, we came across a few issues:

1) We currently store the data from Spark to Cassandra, hence to load data we need to call .loadCache(). I know there are ways for data written to Ignite to be synced with Cassandra (write-behind, write-through). However, we want to do the opposite: load into Cassandra and have it reflected in the cache, which can then be queried by Spark. Is there a way to do so?

2) To load data into the cache from Cassandra, I start a new client on another machine and call the .loadCache() method. However, it takes almost 45 minutes to load the data (around 30 million rows with 20 columns each). Is there a way to make this faster by ensuring that data from a particular node in the Cassandra cluster is loaded in parallel into the cache instance on the same node? I have defined my partition and clustering columns in my Spring persistence-settings.

Thanks,
Shrey
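One generic pattern for speeding up a bulk load like this (independent of Ignite's built-in loadCache) is to split the Cassandra token space into contiguous ranges and fetch them concurrently, ideally running a loader near each node so reads stay local. A minimal stdlib-only sketch of that fan-out; the fetchRange stub is hypothetical and stands in for a real range query such as "SELECT ... WHERE token(pk) > ? AND token(pk) <= ?":

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRangeLoader {
    // Hypothetical stand-in for a Cassandra token-range query; a real
    // implementation would page through rows for (start, end].
    static List<String> fetchRange(long start, long end) {
        return Collections.singletonList("rows(" + start + "," + end + "]");
    }

    // Split the token space into `parallelism` contiguous ranges and
    // fetch them concurrently on a fixed-size thread pool.
    static List<String> loadAll(long min, long max, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            long step = (max - min) / parallelism;
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                long start = min + i * step;
                long end = (i == parallelism - 1) ? max : start + step;
                futures.add(pool.submit(() -> fetchRange(start, end)));
            }
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get()); // propagate any per-range failure
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 4 ranges fetched concurrently over the token span [0, 100]
        System.out.println(loadAll(0, 100, 4).size());
    }
}
```

The same idea is what a node-local loader would do, except each node would claim only the token ranges it owns as a Cassandra replica.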
Re: Ignite to Cassandra change from 1.9 to 2.0
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at java.lang.Thread.run(Thread.java:745)
Disconnected from the target VM, address: '127.0.0.1:39960', transport: 'socket'
Process finished with exit code 1

HistoryResult Class

public class HistoryResult {
    @QuerySqlField
    private String key;
    @QuerySqlField(name = "session_id")
    private String sessionId;
    @QuerySqlField(name = "session_time")
    private Date sessionTime;
    @QuerySqlField(name = "algorithm_name")
    private String algorithmName;
    @QueryTextField
    private String results;
    @QuerySqlField(name = "analysis_time")
    private Date analysisTime;
    @QuerySqlField(name = "created_dt")
    private Date createdDate;
    @QuerySqlField(name = "created_by")
    private String createdBy;
    @QuerySqlField(name = "modified_dt")
    private Date modifiedDate;
    @QuerySqlField(name = "modified_by")
    private String modifiedBy;
    ...

HistoryResultKey Class

public class HistoryResultKey {
    @AffinityKeyMapped
    @QuerySqlField(index = true, groups = { "historyResultPK" })
    private String key;
    @QuerySqlField(index = true, groups = { "historyResultPK" }, name = "session_id")
    private String sessionId;
    @QuerySqlField(index = true, groups = { "historyResultPK" }, name = "algorithm_name")
    private String algorithmName;
    ...
Persistence Settings [DOES NOT WORK]

comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2

Persistence Settings [DOES WORK]

comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13422.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
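The XML of the persistence descriptors above appears to have been stripped by the list archive; only the tableOptions text survived. For orientation, a typical Ignite Cassandra persistence descriptor with the POJO strategy looks roughly like the sketch below. The class names and ttl are taken from elsewhere in the thread; the keyspace and package names are placeholders:

```xml
<persistence keyspace="my_keyspace" table="HistoryResult" ttl="2592000">
    <tableOptions>
        comment = 'Test table for Ignite/Cassandra connection'
        AND read_repair_chance = 0.2
    </tableOptions>
    <keyPersistence class="com.test.HistoryResultKey" strategy="POJO"/>
    <valuePersistence class="com.test.HistoryResult" strategy="POJO"/>
</persistence>
```

With strategy="POJO" and no explicit field elements, the column mapping comes from the class's getter/setter annotations, which is exactly what this thread is debugging.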
Re: Ignite to Cassandra change from 1.9 to 2.0
First, I do have setters on the class; I just didn't include them for space reasons. Second, this is the exact opposite of what I was told originally back in January for v1.8. So this looks like a breaking change introduced in v2.0. I haven't checked the latest "what's new" recently. Was there a mention in there about this change?

Previous responses to my questions back in January:

===
Igor Rudyak, Jan 04, 2017; 8:41pm - Re: Ignite with Cassandra questions / errors

Ok, I took a look at the HistoryResult implementation once again and found the reason. Your Java class should follow JavaBeans conventions. The most important point here is that your class should implement getter/setter methods for READ/WRITE properties, or getters for READ-ONLY properties. In your case you just have a class with private members annotated by @QuerySqlField - this will not work. You should implement getter/setter methods for these private fields and then annotate the getter or setter with @QuerySqlField.

===
Igor Rudyak, Jan 05, 2017; 11:51am - Re: Ignite with Cassandra questions / errors

Hi Kenan, you missed the main point - getters or setters of your custom classes should be annotated with @QuerySqlField instead of class private members. Here is a slightly modified version of your custom classes which should work:

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13247.html
Re: Ignite to Cassandra change from 1.9 to 2.0
The second problem is that your *HistoryResultKey* doesn't have setters. It will not work without setters - your POJO classes should follow the JavaBeans convention.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13161.html
Re: Ignite to Cassandra change from 1.9 to 2.0
First of all - the annotations *@QuerySqlField* and *@QueryTextField* are no longer supported on methods. Because of this, the simplified persistence descriptor doesn't work as expected.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13160.html
Re: Ignite to Cassandra change from 1.9 to 2.0
I upgraded from Ignite 1.9 to Ignite 2.0 and this started happening. The stack trace is below. I'm also including the .sh script that I'm running so you can see that the only difference between the executions is pointing to v1.9 versus pointing to v2.0. My code didn't change, but it fails in 2.0 and works in 1.9. You'll find the "main" execution code at the bottom.

.sh

#export IGNITE_HOME=/app/ignite/apache-ignite-fabric-1.9.0-bin
export IGNITE_HOME=/app/ignite/apache-ignite-fabric-2.0.0-bin
export IGNITE_HOME_LIBS=${IGNITE_HOME}/libs
export IGNITE_LIBS=${IGNITE_HOME_LIBS}/*:${IGNITE_HOME_LIBS}/ignite-spring/*:${IGNITE_HOME_LIBS}/ignite-indexing/*:${IGNITE_HOME_LIBS}/ignite-cassandra-store/*:${IGNITE_HOME_LIBS}/ignite-cassandra-serializers/*
java -cp iat2Kafka-0.0.1.jar:conf:libs/*:${IGNITE_LIBS} com.test.TestCassandraPersistence &

Execution/StackTrace

[user1@host001 app1]$ ./run.sh
[user1@host001 app1]$
[09:43:25] ver. 2.0.0#20170430-sha1:d4eef3c6
[09:43:25] 2017 Copyright(C) Apache Software Foundation
[09:43:25] Ignite documentation: http://ignite.apache.org
[09:43:25] Quiet mode.
[09:43:25] ^-- Logging to file '/app/ignite/apache-ignite-fabric-2.0.0-bin/work/log/ignite-de853f44.0.log'
[09:43:25] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[09:43:25] OS: Linux 2.6.32-696.el6.x86_64 amd64
[09:43:25] VM information: Java(TM) SE Runtime Environment 1.8.0_131-b11 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.131-b11
[09:43:25] Configured plugins:
[09:43:25] ^-- None
[09:43:25] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[09:43:25] Security status [authentication=off, tls/ssl=off]
log4j:WARN No appenders could be found for logger (org.springframework.beans.factory.support.DefaultListableBeanFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[09:43:27] Performance suggestions for grid (fix if possible)
[09:43:27] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[09:43:27] ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM options)
[09:43:27] ^-- Specify JVM heap max size (add '-Xmx[g|G|m|M|k|K]' to JVM options)
[09:43:27] ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=[g|G|m|M|k|K]' to JVM options)
[09:43:27] ^-- Disable processing of calls to System.gc() (add '-XX:+DisableExplicitGC' to JVM options)
[09:43:27] ^-- Speed up flushing of dirty pages by OS (alter vm.dirty_expire_centisecs parameter by setting to 500)
[09:43:27] ^-- Reduce pages swapping ratio (set vm.swappiness=10)
[09:43:27] ^-- Avoid direct reclaim and page allocation failures (set vm.extra_free_kbytes=124)
[09:43:27] ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
[09:43:27] Refer to this page for more performance suggestions: https://apacheignite.readme.io/docs/jvm-and-system-tuning
[09:43:27] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[09:43:27] Ignite node started OK (id=de853f44)
[09:43:27] Topology snapshot [ver=1, servers=1, clients=0, CPUs=144, heap=27.0GB]
>>> Cache store example started.
>>> Putting to C*.
Key: [HistoryResultKey = [key: key1, sessionId: sessionId1, algorithmName: algoName2]],
Result: [HistoryResult = [key: key1, sessionId: sessionId1, sessionTime: 2017-05-24 09:43:27.292, algorithmName: algoName2, results: results-2017-05-24T09:43:27.298, analysisTime: 2017-05-24 09:43:27.298, createdDate: 2017-05-24 09:43:27.298, createdBy: creator, modifiedDate: 2017-05-24 09:43:27.298, modifiedBy: updater]]

[09:43:38,214][SEVERE][main][CassandraCacheStore] Failed to execute Cassandra CQL statement:
insert into "dev_qlty"."HistoryResult" ("algorithmname", "sessionid", "key", "analysistime", "createdby", "createddate", "modifiedby", "modifieddate", "results", "sessiontime") values (?,?,?,?,?,?,?,?,?,?) using ttl 2592000;
class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL statement:
insert into "dev_qlty"."HistoryResult" ("algorithmname", "sessionid", "key", "analysistime", "createdby", "createddate", "modifiedby", ...
Re: Ignite to Cassandra change from 1.9 to 2.0
Could you please provide the full exception stack trace? What do you mean by "upgraded to version 2.0" - is it the Cassandra or Ignite version, or something else?

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-to-Cassandra-change-from-1-9-to-2-0-tp13099p13108.html
Ignite to Cassandra change from 1.9 to 2.0
I've just upgraded to the new version 2.0 from 1.8/1.9 with a Cassandra .xml configuration as the backing data store. However, v2.0 is failing to pull the column name that's tied to the field name in the persistence definition. Because of that, the CQL being generated is trying to call Cassandra using the field name (e.g. sessionId) instead of the column name (e.g. session_id). I'm including the persistence.xml configurations that I've tried: one using a pure POJO strategy definition and one using a POJO strategy with the definition spelled out in the config file. Neither worked.

Error

[13:20:54,721][SEVERE][main][CassandraCacheStore] Failed to execute Cassandra CQL statement:
insert into "dev_keyspace"."HistoryResult" ("algorithmname", "sessionid", "key", "analysistime", "createdby", "createddate", "modifiedby", "modifieddate", "results", "sessiontime") values (?,?,?,?,?,?,?,?,?,?) using ttl 2592000;
class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL statement:
insert into "dev_keyspace"."HistoryResult" ("algorithmname", "sessionid", "key", "analysistime", "createdby", "createddate", "modifiedby", "modifieddate", "results", "sessiontime") values (?,?,?,?,?,?,?,?,?,?) using ttl 2592000;

Pure POJO Strategy Config

comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2

POJO Strategy Full Definition Config

comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2

HistoryResultKey

public class HistoryResultKey {
    private String key;
    private String sessionId;
    private String algorithmName;

    public HistoryResultKey() {
        // No op.
    }

    public HistoryResultKey(final String key, final String sessionId, final String algorithmName) {
        this.key = key;
        this.sessionId = sessionId;
        this.algorithmName = algorithmName;
    }

    @AffinityKeyMapped
    @QuerySqlField(index = true, groups = { "historyResultPK" })
    public String getKey() { return this.key; }

    @QuerySqlField(index = true, groups = { "historyResultPK" }, name = "session_id")
    public String getSessionId() { return this.sessionId; }

    @QuerySqlField(index = true, groups = { "historyResultPK" }, name = "algorithm_name")
    public String getAlgorithmName() { return this.algorithmName; }
}

HistoryResult

public class HistoryResult {
    private String key;
    private String sessionId;
    private Date sessionTime;
    private String algorithmName;
    private String results;
    private Date analysisTime;
    private Date createdDate;
    private String createdBy;
    private Date modifiedDate;
    private String modifiedBy;

    public HistoryResult() {
        // no op
    }

    public HistoryResult(final String key, final String sessionId, final Date sessionTime,
                         final String algorithmName, final String results, final Date analysisTime,
                         final Date createdDate, final String createdBy, final Date modifiedDate,
                         final String modifiedBy) {
        this.key = key;
        this.sessionId = sessionId;
        this.sessionTime = sessionTime;
        this.algorithmName = algorithmName;
        this.results = results;
        this.analysisTime = analysisTime;
        this.createdDate = createdDate;
        this.createdBy = createdBy;
        this.modifiedDate = modifiedDate;
        this.modifiedBy = modifiedBy;
    }

    @QuerySqlField
    public String getKey() { return this.key; }

    @QuerySqlField(name = "session_id")
    public String getSessionId() { return this.sessionId; }

    @QuerySqlField(name = "session_time")
    public Date getSessionTime() { return this.sessionTime; }

    @QuerySqlField(name = "algorithm_name")
    public String getAlgorithmName() { return this.algorithmName; }

    @QueryTextField
    public String getResults() { return this.results; }
    ...
Re: Newbie: Questions on Ignite over cassandra
I meant that Cassandra itself will be involved only when you load the data into caches, which is a separate step that should happen prior to query execution. When an Ignite query is executed, Cassandra is not touched.

The answer to your question is yes - any joins are possible, similar to any relational database. However, for good performance you should consider collocation and indexing. See the documentation for details: https://apacheignite.readme.io/docs/sql-grid

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Newbie-Questions-on-Ignite-over-cassandra-tp10264p10274.html
Re: Newbie: Questions on Ignite over cassandra
Hi,

Please properly subscribe to the mailing list so that the community can receive email notifications for your messages. To subscribe, send an empty email to user-subscr...@ignite.apache.org and follow the simple instructions in the reply.

Jenny B. wrote
> I am exploring Apache Ignite on top of Cassandra as a possible tool to be
> able to run ad-hoc queries on cassandra tables. Using Ignite, is it
> possible to search or query on any column in the underlying
> cassandra tables, like an RDBMS? Or can the join columns and search columns
> only be partition and clustering columns?
>
> If using Ignite, is there still a need to create indexes on cassandra? Also,
> how does ignite treat materialized views? Will there be a need to create
> materialized views?
>
> Also, any insights into how updates to a cassandra release can/will be
> handled by Ignite would be very helpful.

If you execute in-memory SQL queries using the Ignite API, you have to load the data from the store first: https://apacheignite.readme.io/docs/data-loading#ignitecacheloadcache

Read-through works only for key-based access. With queries you don't know the set of required keys in advance, thus only the data which is already in memory is used to execute them.

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Newbie-Questions-on-Ignite-over-cassandra-tp10264p10268.html
Re: Ignite with Cassandra questions / errors
Ah, now I understand. That worked. Thanks!

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9913.html
Re: Ignite with Cassandra questions / errors
@QuerySqlField(name = "analysis_time")
public Date getAnalysisTime() { return analysisTime; }
public void setAnalysisTime(Date analysisTime) { this.analysisTime = analysisTime; }

@QuerySqlField(name = "created_dt")
public Date getCreatedDate() { return createdDate; }
public void setCreatedDate(Date createdDate) { this.createdDate = createdDate; }

@QuerySqlField(name = "created_by")
public String getCreatedBy() { return createdBy; }
public void setCreatedBy(String createdBy) { this.createdBy = createdBy; }

@QuerySqlField(name = "modified_dt")
public Date getModifiedDate() { return modifiedDate; }
public void setModifiedDate(Date modifiedDate) { this.modifiedDate = modifiedDate; }

@QuerySqlField(name = "modified_by")
public String getModifiedBy() { return modifiedBy; }
public void setModifiedBy(String modifiedBy) { this.modifiedBy = modifiedBy; }

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null) return false;
    if (!(o instanceof HistoryResult)) return false;
    HistoryResult that = (HistoryResult) o;
    if (vin != null ? !vin.equals(that.vin) : that.vin != null) return false;
    if (sessionId != null ? !sessionId.equals(that.sessionId) : that.sessionId != null) return false;
    if (sessionTime != null ? !sessionTime.equals(that.sessionTime) : that.sessionTime != null) return false;
    if (histName != null ? !histName.equals(that.histName) : that.histName != null) return false;
    if (results != null ? !results.equals(that.results) : that.results != null) return false;
    if (analysisTime != null ? !analysisTime.equals(that.analysisTime) : that.analysisTime != null) return false;
    if (createdDate != null ? !createdDate.equals(that.createdDate) : that.createdDate != null) return false;
    if (createdBy != null ? !createdBy.equals(that.createdBy) : that.createdBy != null) return false;
    if (modifiedDate != null ? !modifiedDate.equals(that.modifiedDate) : that.modifiedDate != null) return false;
    if (modifiedBy != null ? !modifiedBy.equals(that.modifiedBy) : that.modifiedBy != null) return false;
    return true;
}

@Override
public int hashCode() {
    int res = vin != null ? vin.hashCode() : 0;
    res = 31 * res + (sessionId != null ? sessionId.hashCode() : 0);
    res = 31 * res + (sessionTime != null ? sessionTime.hashCode() : 0);
    res = 31 * res + (histName != null ? histName.hashCode() : 0);
    res = 31 * res + (results != null ? results.hashCode() : 0);
    res = 31 * res + (analysisTime != null ? analysisTime.hashCode() : 0);
    res = 31 * res + (createdDate != null ? createdDate.hashCode() : 0);
    res = 31 * res + (createdBy != null ? createdBy.hashCode() : 0);
    res = 31 * res + (modifiedDate != null ? modifiedDate.hashCode() : 0);
    res = 31 * res + (modifiedBy != null ? modifiedBy.hashCode() : 0);
    return res;
}

@Override
public String toString() {
    StringBuilder sb = new StringBuilder();
    sb.append("HistoryResult = [");
    sb.append("vin: "); sb.append(vin);
    sb.append(", sessionId: "); sb.append(sessionId);
    sb.append(", sessionTime: "); sb.append(sessionTime);
    sb.append(", histName: "); sb.append(histName);
    sb.append(", results: "); sb.append(results);
    sb.append(", analysisTime: "); sb.append(analysisTime);
    sb.append(", createdDate: "); sb.append(createdDate);
    sb.append(", createdBy: "); sb.append(createdBy);
    sb.append(", modifiedDate: "); sb.append(modifiedDate);
    sb.append(", modifiedBy: "); sb.append(modifiedBy);
    sb.append("]");
    return sb.toString();
}
}

For these two classes your Cassandra persistence descriptor could be as simple as:

*comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2*

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9909.html
Re: Ignite with Cassandra questions / errors
public String getHistName() { return histName; }
public void setHistName(String histName) { this.histName = histName; }

public String getResults() { return results; }
public void setResults(String results) { this.results = results; }

public Date getAnalysisTime() { return analysisTime; }
public void setAnalysisTime(Date analysisTime) { this.analysisTime = analysisTime; }

public Date getCreatedDate() { return createdDate; }
public void setCreatedDate(Date createdDate) { this.createdDate = createdDate; }

public String getCreatedBy() { return createdBy; }
public void setCreatedBy(String createdBy) { this.createdBy = createdBy; }

public Date getModifiedDate() { return modifiedDate; }
public void setModifiedDate(Date modifiedDate) { this.modifiedDate = modifiedDate; }

public String getModifiedBy() { return modifiedBy; }
public void setModifiedBy(String modifiedBy) { this.modifiedBy = modifiedBy; }

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null) return false;
    if (!(o instanceof HistoryResult)) return false;
    HistoryResult that = (HistoryResult) o;
    if (vin != null ? !vin.equals(that.vin) : that.vin != null) return false;
    if (sessionId != null ? !sessionId.equals(that.sessionId) : that.sessionId != null) return false;
    if (sessionTime != null ? !sessionTime.equals(that.sessionTime) : that.sessionTime != null) return false;
    if (histName != null ? !histName.equals(that.histName) : that.histName != null) return false;
    if (results != null ? !results.equals(that.results) : that.results != null) return false;
    if (analysisTime != null ? !analysisTime.equals(that.analysisTime) : that.analysisTime != null) return false;
    if (createdDate != null ? !createdDate.equals(that.createdDate) : that.createdDate != null) return false;
    if (createdBy != null ? !createdBy.equals(that.createdBy) : that.createdBy != null) return false;
    if (modifiedDate != null ? !modifiedDate.equals(that.modifiedDate) : that.modifiedDate != null) return false;
    if (modifiedBy != null ? !modifiedBy.equals(that.modifiedBy) : that.modifiedBy != null) return false;
    return true;
}

@Override
public int hashCode() {
    int res = vin != null ? vin.hashCode() : 0;
    res = 31 * res + (sessionId != null ? sessionId.hashCode() : 0);
    res = 31 * res + (sessionTime != null ? sessionTime.hashCode() : 0);
    res = 31 * res + (histName != null ? histName.hashCode() : 0);
    res = 31 * res + (results != null ? results.hashCode() : 0);
    res = 31 * res + (analysisTime != null ? analysisTime.hashCode() : 0);
    res = 31 * res + (createdDate != null ? createdDate.hashCode() : 0);
    res = 31 * res + (createdBy != null ? createdBy.hashCode() : 0);
    res = 31 * res + (modifiedDate != null ? modifiedDate.hashCode() : 0);
    res = 31 * res + (modifiedBy != null ? modifiedBy.hashCode() : 0);
    return res;
}

@Override
public String toString() {
    StringBuilder sb = new StringBuilder();
    sb.append("HistoryResult = [");
    sb.append("vin: "); sb.append(vin);
    sb.append(", sessionId: "); sb.append(sessionId);
    sb.append(", sessionTime: "); sb.append(sessionTime);
    sb.append(", histName: "); sb.append(histName);
    sb.append(", results: "); sb.append(results);
    sb.append(", analysisTime: "); sb.append(analysisTime);
    sb.append(", createdDate: "); sb.append(createdDate);
    sb.append(", createdBy: "); sb.append(createdBy);
    sb.append(", modifiedDate: "); sb.append(modifiedDate);
    sb.append(", modifiedBy: "); sb.append(modifiedBy);
    sb.append("]");
    return sb.toString();
}
}

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9900.html
Re: Ignite with Cassandra questions / errors
Ok, I took a look at the *HistoryResult* implementation once again and found the reason. Your Java class should follow JavaBeans conventions <http://docstore.mik.ua/orelly/java-ent/jnut/ch06_02.htm>. The most important point here is that your class should implement getter/setter methods for READ/WRITE properties, or getters for READ-ONLY properties. In your case you just have a class with *private* members annotated by *@QuerySqlField* - this will not work. You should implement getter/setter methods for these *private* fields and then annotate the getter or setter with *@QuerySqlField*.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9887.html
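The JavaBeans requirement described here can be checked mechanically: java.beans.Introspector only discovers properties that have accessor methods, which is why a class with only annotated private fields exposes nothing. A small stdlib-only demonstration (the class names are illustrative, not from the thread):

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;

public class BeanCheck {
    // Field only, no accessors: not a JavaBeans property.
    public static class FieldOnly {
        private String sessionId;
    }

    // Proper getter/setter pair: discovered as property "sessionId".
    public static class WithAccessors {
        private String sessionId;
        public String getSessionId() { return sessionId; }
        public void setSessionId(String v) { this.sessionId = v; }
    }

    // Count the bean properties the introspector finds, excluding
    // the inherited "class" property by stopping at Object.
    static int propertyCount(Class<?> cls) throws IntrospectionException {
        BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
        return info.getPropertyDescriptors().length;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(propertyCount(FieldOnly.class));     // 0
        System.out.println(propertyCount(WithAccessors.class)); // 1
    }
}
```

Any framework that relies on bean introspection (as the Cassandra store's POJO strategy does here) will see exactly what this prints: zero properties for the field-only class.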
Re: Ignite with Cassandra questions / errors
Ok, so I got my code working reading/inserting/updating from/to C*, but I had to manually set up the persistence.xml as below and change my java.sql.Timestamp types in the class to java.util.Date types.

comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9877.html
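On the java.sql.Timestamp point: Timestamp is a subclass of java.util.Date, so switching field types to Date loses nothing at millisecond precision, and Date-typed fields can still be assigned Timestamp values directly. A quick stdlib check of that relationship:

```java
import java.sql.Timestamp;
import java.util.Date;

public class TimestampDemo {
    public static void main(String[] args) {
        Timestamp ts = new Timestamp(1_500_000_000_000L);

        // Every Timestamp is-a Date, so a Date-typed field accepts it
        // without conversion.
        Date asDate = ts;
        System.out.println(asDate.getTime() == ts.getTime()); // true

        // Converting back preserves the millisecond instant; sub-millisecond
        // nanos are the only thing Timestamp adds on top of Date.
        Timestamp roundTrip = new Timestamp(asDate.getTime());
        System.out.println(roundTrip.equals(ts)); // true
    }
}
```

So the type change described above is safe unless sub-millisecond precision actually matters for the stored data.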
Re: Ignite with Cassandra questions / errors
This is what I remember reading in the docs as well. However, I just ran the DDLGenerator using my cassandra-persistence-settings.xml file, and the following is what it generated. So either I don't have something set up correctly for Ignite to recognize the annotations, or there's a problem in the DDLGenerator such that the annotations are being ignored by it. I'm perfectly willing to accept that I've done something wrong.

- DDL for keyspace/table from file: resources/cassandra-persistence-settings.xml -

create keyspace if not exists "dev_qlty" with replication = {'class' : 'SimpleStrategy', 'replication_factor' : 3} and durable_writes = true;

create table if not exists "dev_qlty"."HistoryResult" (
    "algorithmname" text,
    "sessionid" text,
    "vin" text,
    "createdby" text,
    "modifiedby" text,
    "results" text,
    primary key (("algorithmname", "sessionid", "vin"))
) with comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2;

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9875.html
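For contrast with the generated DDL above, if the @QuerySqlField name attributes were being picked up, one would expect something roughly like the sketch below: snake_case column names from the annotations and timestamp columns for the Date fields. This is an illustration of the expected shape, not actual DDLGenerator output, and the exact key columns depend on the key class's annotations:

```sql
create table if not exists "dev_qlty"."HistoryResult" (
    "session_id" text,
    "vin" text,
    "session_time" timestamp,
    "results" text,
    "analysis_time" timestamp,
    "created_dt" timestamp,
    "created_by" text,
    "modified_dt" timestamp,
    "modified_by" text,
    primary key (("session_id", "vin"))
) with comment = 'Test table for Ignite/Cassandra connection' AND read_repair_chance = 0.2;
```

The gap between this and the actual output (lowercased field names, missing timestamp columns) is exactly the symptom discussed in this thread.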
Re: Ignite with Cassandra questions / errors
Hi,

This is not actually 100% true - the Cassandra integration supports @QuerySqlField annotations for creating tables and mapping between object fields and table columns. Kenan, have you tried the Cassandra DDL generator https://apacheignite-mix.readme.io/docs/ddl-generator for your persistence descriptor? It will generate the DDL for your table - that way you can check if you missed something.

Igor

On Fri, Dec 16, 2016 at 11:01 AM, vkulichenko <valentin.kuliche...@gmail.com> wrote:
> Hi,
>
> @QuerySqlField is an annotation for Ignite SQL [1]; it has nothing to do with
> the Cassandra integration. To specify a column name which differs from the Java
> field name, you should use 'field' tags inside 'valuePersistence', as shown in
> the example [2].
>
> [1] https://apacheignite.readme.io/v1.8/docs/indexes#annotation-based-configuration
> [2] https://apacheignite-mix.readme.io/docs/examples#example-5
>
> -Val
Re: Ignite with Cassandra questions / errors
Hi,

@QuerySqlField is an annotation for Ignite SQL [1]; it has nothing to do with the Cassandra integration. To specify a column name which differs from the Java field name, you should use 'field' tags inside 'valuePersistence', as shown in the example [2].

[1] https://apacheignite.readme.io/v1.8/docs/indexes#annotation-based-configuration
[2] https://apacheignite-mix.readme.io/docs/examples#example-5

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-questions-errors-tp9607p9608.html
Ignite with Cassandra questions / errors
Hi. I have 2 questions regarding Ignite & Cassandra. I'm using Ignite v1.8 I'm trying to get a very simple example working with read/write through to c*, but I'm having some difficulty. First, I'm trying to use the POJO strategy of configuration for both the key & value persistence with the class fieldnames slightly different than the column names, but defined in the @QuerySqlField(name="blah") annotation of the POJO. However, when it runs, I'm getting a "IgniteException: Failed to prepare Cassandra CQL statement: insert into . " and all of the column names are the lowercase names of the class field and not what is defined in the annotation. What's my problem? (configuration/class info to follow at the bottom) Second, none of my "java.sql.Timestamp" fields are showing up in the insert statement generated in the exception. My c* columns are defined as "timestamp" fields and what I've seen seems to indicate that those should be "java.sql.Timestamp" datatypes, but it's not working. How do I get my Timestamp fields to be recognized? === Ignite Exception: [Mule] 2016-12-16 09:13:21,901 WARN -com.datastax.driver.core.ReplicationStrategy$NetworkTopologyStrategy.computeTokenToReplicaMap(ReplicationStategy.java:198) - Error while computing token map for keyspace dev_cirad with datacenter dc1: could not achieve replication factor 3 (found 0 replicas only), check your keyspace replication settings. [09:13:22,324][SEVERE][main][CassandraCacheStore] Failed to execute Cassandra CQL statement: insert into "mykeyspace"."HistoryResult" ("histname", "sessionid", "vin", "createdby", "modifiedby", "results") values (?,?,?,?,?,?) using ttl 2592000; class org.apache.ignite.IgniteException: Failed to execute Cassandra CQL statement: insert into "mykeyspace"."HistoryResult" ("histname", "sessionid", "vin", "createdby", "modifiedby", "results") values (?,?,?,?,?,?) 
using ttl 2592000;
    at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:163)
Caused by: class org.apache.ignite.IgniteException: Failed to prepare Cassandra CQL statement: insert into "mykeyspace"."HistoryResult" ("histname", "sessionid", "vin", "createdby", "modifiedby", "results") values (?,?,?,?,?,?) using ttl 2592000;
    at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:615)
    at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:133)
    ... 20 more
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Unknown identifier histname
    at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50)
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
    at com.datastax.driver.core.AbstractSession.prepare(AbstractSession.java:98)
    at org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:597)
    ... 21 more
=== cassandra-ignite.xml (Spring beans XML; markup stripped by the archive — only the discovery address range 127.0.0.1:47500..47509 survives)
=== cassandra-connection-settings.xml (Spring beans XML; markup stripped by the archive and truncated)
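[Archive note] On the "Unknown identifier histname" error above: Ignite quotes the generated column names, and in CQL quoted identifiers are matched case-sensitively, so they must equal the column names exactly as they exist in the table. A hedged illustration — the table layout below is hypothetical, not the poster's actual schema:

```sql
-- Columns created unquoted are stored lowercased, so "histname" would match here:
CREATE TABLE "mykeyspace"."HistoryResult" (
    histname  text PRIMARY KEY,
    results   text
);

-- But if the real column was created under a different name (say hist_name,
-- or a quoted mixed-case "histName"), then the generated statement
--   insert into "mykeyspace"."HistoryResult" ("histname", ...) ...
-- fails with "Unknown identifier histname" at prepare time.
```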
Re: Ignite with cassandra
Hi, I think this problem is related to the Cassandra configuration or the network; check your firewall. You can try changing the Cassandra configuration to enable port 9160. Also check which version of the cassandra-jdbc driver you are using; different versions use different ports.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-cassandra-tp8777p8881.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
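[Archive note] Before blaming the driver, it helps to check whether the relevant port (9042 for Cassandra's native protocol, 9160 for the legacy Thrift transport) is reachable at all. A generic TCP probe sketch, not specific to Ignite or Cassandra; the host address is a placeholder:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the Cassandra native-protocol and legacy Thrift ports.
for port in (9042, 9160):
    state = "open" if port_open("127.0.0.1", port) else "closed/filtered"
    print(port, state)
```

If the port is closed or filtered, fix the firewall or Cassandra's listen settings before tuning the driver.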
Re: spark SQL thriftserver over ignite and cassandra
>>>> It's a security issue, Ignite cache doesn't provide multiple user >>>> account per cache. I am thinking of using Spark to authenticate multiple >>>> users and then Spark use a shared account on Ignite cache >>>> >>>> >>>> Basically, Ignite provides basic security interfaces and some >>>> implementations which you can rely on by building your secure solution. >>>> This article can be useful for your case >>>> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ >>>> >>>> — >>>> Denis >>>> >>>> >>>>> If you need a real multi-tenancy support where cacheA is allowed to be >>>>> accessed by a group of users A only and cacheB by users from group B then >>>>> you can take a look at GridGain which is built on top of Ignite >>>>> https://gridgain.readme.io/docs/multi-tenancy >>>>> >>>>> >>>>> >>>> OK but I am evaluating open source only solutions (kylin, druid, >>>> alluxio...), it's a constraint from my hierarchy >>>> >>>>> >>>>> What I want to achieve is : >>>>> - use Cassandra for data store as it provides idempotence (HDFS/hive >>>>> doesn't), resulting in exactly once semantic without any duplicates. >>>>> - use Spark SQL thriftserver in multi tenancy for large scale adhoc >>>>> analytics queries (> TB) from an ODBC driver through HTTP(S) >>>>> - accelerate Cassandra reads when the data modeling of the Cassandra >>>>> table doesn't fit the queries. Queries would be OLAP style: target >>>>> multiple >>>>> C* partitions, groupby or filters on lots of dimensions that aren't >>>>> necessarely in the C* table key. >>>>> >>>>> >>>>> As it was mentioned Ignite uses Cassandra as a CacheStore. You should >>>>> keep this in mind. Before trying to assemble all the chain I would >>>>> recommend you trying to connect Spark SQL thrift server directly to Ignite >>>>> and work with its shared RDDs [1]. A shared RDD (basically Ignite cache) >>>>> can be backed by Cassandra. Probably this chain will work for you but I >>>>> can’t give more precise guidance on this. 
>>>>> >>>>> >>>> I will try to make it works and give you feedback >>>> >>>> >>>> >>>>> [1] https://apacheignite-fs.readme.io/docs/ignite-for-spark >>>>> >>>>> — >>>>> Denis >>>>> >>>>> Thanks for your advises >>>>> >>>>> >>>>> 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>: >>>>> >>>>>> I am not sure that this will be performant. What do you want to >>>>>> achieve here? Fast lookups? Then the Cassandra Ignite store might be the >>>>>> right solution. If you want to do more analytic style of queries then you >>>>>> can put the data on HDFS/Hive and use the Ignite HDFS cache to cache >>>>>> certain partitions/tables in Hive in-memory. If you want to go to >>>>>> iterative >>>>>> machine learning algorithms you can go for Spark on top of this. You can >>>>>> use then also Ignite cache for Spark RDDs. >>>>>> >>>>>> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> >>>>>> wrote: >>>>>> >>>>>> Hi, Vincent! >>>>>> >>>>>> Ignite also has SQL support (also scalable), I think it will be much >>>>>> faster to query directly from Ignite than query from Spark. >>>>>> Also please mind, that before executing queries you should load all >>>>>> needed data to cache. >>>>>> To load data from Cassandra to Ignite you may use Cassandra store [1]. >>>>>> >>>>>> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra >>>>>> >>>>>> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski < >>>>>> vincent.gromakow...@gmail.com> wrote: >>>>>> >>>>>>> Hi, >>>>>>> I am evaluating the possibility to use Spark SQL (and its >>>>>>> scalability) over an Ignite cache with Cassandra persistent store to >>>>>>> increase read workloads like OLAP style analytics. >>>>>>> Is there any way to configure Spark thriftserver to load an external >>>>>>> table in Ignite like we can do in Cassandra ? 
>>>>>>> Here is an example of config for spark backed by cassandra >>>>>>> >>>>>>> CREATE EXTERNAL TABLE MyHiveTable >>>>>>> ( id int, data string ) >>>>>>> STORED BY 'org.apache.hadoop.hive.cassan >>>>>>> dra.cql.CqlStorageHandler' >>>>>>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", " >>>>>>> cassandra.ks.name" = "test" , >>>>>>> "cassandra.cf.name" = "mytable" , >>>>>>> "cassandra.ks.repfactor" = "1" , >>>>>>> "cassandra.ks.strategy" = >>>>>>> "org.apache.cassandra.locator.SimpleStrategy" ); >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Alexey Kuznetsov >>>>>> >>>>>> >>>> >> >
Re: spark SQL thriftserver over ignite and cassandra
Hi I mean using HTTPS transport instead of binary (thrift?) transport. 2016-10-17 19:10 GMT+02:00 Igor Sapego <isap...@gridgain.com>: > Hi Vincent, > > Can you please explain what do you mean by HTTP(S) support for the ODBC? > > I'm not quite sure I get it. > > Best Regards, > Igor > > On Thu, Oct 6, 2016 at 9:59 AM, vincent gromakowski < > vincent.gromakow...@gmail.com> wrote: > >> Thanks >> >> Starting the thriftserver with igniterdd tables doesn't seem very hard. >> Implementing a security layer over ignite cache may be harder as I need to: >> - get username from thriftserver >> - intercept each request and check permissions >> Maybe spark will also be able to handle permissions... >> >> I will keep you informed >> >> Le 6 oct. 2016 00:12, "Denis Magda" <dma...@gridgain.com> a écrit : >> >>> Vincent, >>> >>> Please see below >>> >>> On Oct 5, 2016, at 4:31 AM, vincent gromakowski < >>> vincent.gromakow...@gmail.com> wrote: >>> >>> Hi >>> thanks for your explanations. Please find inline more questions >>> >>> Vincent >>> >>> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>: >>> >>>> Hi Vincent, >>>> >>>> See my answers inline >>>> >>>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski < >>>> vincent.gromakow...@gmail.com> wrote: >>>> >>>> Hi, >>>> I know that Ignite has SQL support but: >>>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier >>>> to integrate on corporate networks with rules, firewalls, proxies >>>> >>>> >>>> *Igor Sapego*, what URIs are supported presently? >>>> >>>> - The SQL engine doesn't seem to scale like Spark SQL would. For >>>> instance, Spark won't generate OOM is dataset (source or result) doesn't >>>> fit in memory. From Ignite side, it's not clear… >>>> >>>> >>>> OOM is not related to scalability topic at all. This is about >>>> application’s logic. >>>> >>>> Ignite SQL engine perfectly scales out along with your cluster. 
>>>> Moreover, Ignite supports indexes which allows you to get O(logN) running >>>> time complexity for your SQL queries while in case of Spark you will face >>>> with full-scans (O(N)) all the time. >>>> >>>> However, to benefit from Ignite SQL queries you have to put all the >>>> data in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational >>>> database, MongoDB, etc) while a SQL query is executed and won’t preload >>>> anything from an underlying CacheStore. Automatic preloading works for >>>> key-value queries like cache.get(key). >>>> >>> >>> >>> This is an issue because I will potentially have to query TB of data. If >>> I use Spark thriftserver backed by IgniteRDD, does it solve this point and >>> can I get automatic preloading from C* ? >>> >>> >>> IgniteRDD will load missing tuples (key-value) pair from Cassandra >>> because essentially IgniteRDD is an IgniteCache and Cassandra is a >>> CacheStore. The only thing that is left to check is whether Spark >>> triftserver can work with IgniteRDDs. Hope you will be able figure out this >>> and share your feedback with us. >>> >>> >>> >>>> - Spark thrift can manage multi tenancy: different users can connect to >>>> the same SQL engine and share cache. In Ignite it's one cache per user, so >>>> a big waste of RAM. >>>> >>>> >>>> Everyone can connect to an Ignite cluster and work with the same set of >>>> distributed caches. I’m not sure why you need to create caches with the >>>> same content for every user. >>>> >>> >>> It's a security issue, Ignite cache doesn't provide multiple user >>> account per cache. I am thinking of using Spark to authenticate multiple >>> users and then Spark use a shared account on Ignite cache >>> >>> >>> Basically, Ignite provides basic security interfaces and some >>> implementations which you can rely on by building your secure solution. 
>>> This article can be useful for your case >>> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ >>> >>> — >>> Denis >>> >>&g
Re: spark SQL thriftserver over ignite and cassandra
Hi Vincent, Can you please explain what do you mean by HTTP(S) support for the ODBC? I'm not quite sure I get it. Best Regards, Igor On Thu, Oct 6, 2016 at 9:59 AM, vincent gromakowski < vincent.gromakow...@gmail.com> wrote: > Thanks > > Starting the thriftserver with igniterdd tables doesn't seem very hard. > Implementing a security layer over ignite cache may be harder as I need to: > - get username from thriftserver > - intercept each request and check permissions > Maybe spark will also be able to handle permissions... > > I will keep you informed > > Le 6 oct. 2016 00:12, "Denis Magda" <dma...@gridgain.com> a écrit : > >> Vincent, >> >> Please see below >> >> On Oct 5, 2016, at 4:31 AM, vincent gromakowski < >> vincent.gromakow...@gmail.com> wrote: >> >> Hi >> thanks for your explanations. Please find inline more questions >> >> Vincent >> >> 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>: >> >>> Hi Vincent, >>> >>> See my answers inline >>> >>> On Oct 4, 2016, at 12:54 AM, vincent gromakowski < >>> vincent.gromakow...@gmail.com> wrote: >>> >>> Hi, >>> I know that Ignite has SQL support but: >>> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier >>> to integrate on corporate networks with rules, firewalls, proxies >>> >>> >>> *Igor Sapego*, what URIs are supported presently? >>> >>> - The SQL engine doesn't seem to scale like Spark SQL would. For >>> instance, Spark won't generate OOM is dataset (source or result) doesn't >>> fit in memory. From Ignite side, it's not clear… >>> >>> >>> OOM is not related to scalability topic at all. This is about >>> application’s logic. >>> >>> Ignite SQL engine perfectly scales out along with your cluster. >>> Moreover, Ignite supports indexes which allows you to get O(logN) running >>> time complexity for your SQL queries while in case of Spark you will face >>> with full-scans (O(N)) all the time. 
>>> >>> However, to benefit from Ignite SQL queries you have to put all the data >>> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational >>> database, MongoDB, etc) while a SQL query is executed and won’t preload >>> anything from an underlying CacheStore. Automatic preloading works for >>> key-value queries like cache.get(key). >>> >> >> >> This is an issue because I will potentially have to query TB of data. If >> I use Spark thriftserver backed by IgniteRDD, does it solve this point and >> can I get automatic preloading from C* ? >> >> >> IgniteRDD will load missing tuples (key-value) pair from Cassandra >> because essentially IgniteRDD is an IgniteCache and Cassandra is a >> CacheStore. The only thing that is left to check is whether Spark >> triftserver can work with IgniteRDDs. Hope you will be able figure out this >> and share your feedback with us. >> >> >> >>> - Spark thrift can manage multi tenancy: different users can connect to >>> the same SQL engine and share cache. In Ignite it's one cache per user, so >>> a big waste of RAM. >>> >>> >>> Everyone can connect to an Ignite cluster and work with the same set of >>> distributed caches. I’m not sure why you need to create caches with the >>> same content for every user. >>> >> >> It's a security issue, Ignite cache doesn't provide multiple user account >> per cache. I am thinking of using Spark to authenticate multiple users and >> then Spark use a shared account on Ignite cache >> >> >> Basically, Ignite provides basic security interfaces and some >> implementations which you can rely on by building your secure solution. 
>> This article can be useful for your case >> http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ >> >> — >> Denis >> >> >>> If you need a real multi-tenancy support where cacheA is allowed to be >>> accessed by a group of users A only and cacheB by users from group B then >>> you can take a look at GridGain which is built on top of Ignite >>> https://gridgain.readme.io/docs/multi-tenancy >>> >>> >>> >> OK but I am evaluating open source only solutions (kylin, druid, >> alluxio...), it's a constraint from my hierarchy >> >>> >>> What I want to achieve is : >>> - use Cassandra for data store a
Re: spark SQL thriftserver over ignite and cassandra
Thanks Starting the thriftserver with igniterdd tables doesn't seem very hard. Implementing a security layer over ignite cache may be harder as I need to: - get username from thriftserver - intercept each request and check permissions Maybe spark will also be able to handle permissions... I will keep you informed Le 6 oct. 2016 00:12, "Denis Magda" <dma...@gridgain.com> a écrit : > Vincent, > > Please see below > > On Oct 5, 2016, at 4:31 AM, vincent gromakowski < > vincent.gromakow...@gmail.com> wrote: > > Hi > thanks for your explanations. Please find inline more questions > > Vincent > > 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>: > >> Hi Vincent, >> >> See my answers inline >> >> On Oct 4, 2016, at 12:54 AM, vincent gromakowski < >> vincent.gromakow...@gmail.com> wrote: >> >> Hi, >> I know that Ignite has SQL support but: >> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to >> integrate on corporate networks with rules, firewalls, proxies >> >> >> *Igor Sapego*, what URIs are supported presently? >> >> - The SQL engine doesn't seem to scale like Spark SQL would. For >> instance, Spark won't generate OOM is dataset (source or result) doesn't >> fit in memory. From Ignite side, it's not clear… >> >> >> OOM is not related to scalability topic at all. This is about >> application’s logic. >> >> Ignite SQL engine perfectly scales out along with your cluster. Moreover, >> Ignite supports indexes which allows you to get O(logN) running time >> complexity for your SQL queries while in case of Spark you will face with >> full-scans (O(N)) all the time. >> >> However, to benefit from Ignite SQL queries you have to put all the data >> in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational >> database, MongoDB, etc) while a SQL query is executed and won’t preload >> anything from an underlying CacheStore. Automatic preloading works for >> key-value queries like cache.get(key). 
>> > > > This is an issue because I will potentially have to query TB of data. If I > use Spark thriftserver backed by IgniteRDD, does it solve this point and > can I get automatic preloading from C* ? > > > IgniteRDD will load missing tuples (key-value) pair from Cassandra because > essentially IgniteRDD is an IgniteCache and Cassandra is a CacheStore. The > only thing that is left to check is whether Spark triftserver can work with > IgniteRDDs. Hope you will be able figure out this and share your feedback > with us. > > > >> - Spark thrift can manage multi tenancy: different users can connect to >> the same SQL engine and share cache. In Ignite it's one cache per user, so >> a big waste of RAM. >> >> >> Everyone can connect to an Ignite cluster and work with the same set of >> distributed caches. I’m not sure why you need to create caches with the >> same content for every user. >> > > It's a security issue, Ignite cache doesn't provide multiple user account > per cache. I am thinking of using Spark to authenticate multiple users and > then Spark use a shared account on Ignite cache > > > Basically, Ignite provides basic security interfaces and some > implementations which you can rely on by building your secure solution. > This article can be useful for your case > http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ > > — > Denis > > >> If you need a real multi-tenancy support where cacheA is allowed to be >> accessed by a group of users A only and cacheB by users from group B then >> you can take a look at GridGain which is built on top of Ignite >> https://gridgain.readme.io/docs/multi-tenancy >> >> >> > OK but I am evaluating open source only solutions (kylin, druid, > alluxio...), it's a constraint from my hierarchy > >> >> What I want to achieve is : >> - use Cassandra for data store as it provides idempotence (HDFS/hive >> doesn't), resulting in exactly once semantic without any duplicates. 
>> - use Spark SQL thriftserver in multi tenancy for large scale adhoc >> analytics queries (> TB) from an ODBC driver through HTTP(S) >> - accelerate Cassandra reads when the data modeling of the Cassandra >> table doesn't fit the queries. Queries would be OLAP style: target multiple >> C* partitions, groupby or filters on lots of dimensions that aren't >> necessarely in the C* table key. >> >> >> As it was mentioned Ignite uses Cassandra as a CacheStore. You should >> keep this in mind. Before trying to assemble
Re: spark SQL thriftserver over ignite and cassandra
Vincent, Please see below > On Oct 5, 2016, at 4:31 AM, vincent gromakowski > <vincent.gromakow...@gmail.com> wrote: > > Hi > thanks for your explanations. Please find inline more questions > > Vincent > > 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com > <mailto:dma...@gridgain.com>>: > Hi Vincent, > > See my answers inline > >> On Oct 4, 2016, at 12:54 AM, vincent gromakowski >> <vincent.gromakow...@gmail.com <mailto:vincent.gromakow...@gmail.com>> wrote: >> >> Hi, >> I know that Ignite has SQL support but: >> - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to >> integrate on corporate networks with rules, firewalls, proxies > > Igor Sapego, what URIs are supported presently? > >> - The SQL engine doesn't seem to scale like Spark SQL would. For instance, >> Spark won't generate OOM is dataset (source or result) doesn't fit in >> memory. From Ignite side, it's not clear… > > OOM is not related to scalability topic at all. This is about application’s > logic. > > Ignite SQL engine perfectly scales out along with your cluster. Moreover, > Ignite supports indexes which allows you to get O(logN) running time > complexity for your SQL queries while in case of Spark you will face with > full-scans (O(N)) all the time. > > However, to benefit from Ignite SQL queries you have to put all the data > in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational database, > MongoDB, etc) while a SQL query is executed and won’t preload anything from > an underlying CacheStore. Automatic preloading works for key-value queries > like cache.get(key). > > > This is an issue because I will potentially have to query TB of data. If I > use Spark thriftserver backed by IgniteRDD, does it solve this point and can > I get automatic preloading from C* ? IgniteRDD will load missing tuples (key-value) pair from Cassandra because essentially IgniteRDD is an IgniteCache and Cassandra is a CacheStore. 
The only thing that is left to check is whether Spark triftserver can work with IgniteRDDs. Hope you will be able figure out this and share your feedback with us. > >> - Spark thrift can manage multi tenancy: different users can connect to the >> same SQL engine and share cache. In Ignite it's one cache per user, so a big >> waste of RAM. > > Everyone can connect to an Ignite cluster and work with the same set of > distributed caches. I’m not sure why you need to create caches with the same > content for every user. > > It's a security issue, Ignite cache doesn't provide multiple user account per > cache. I am thinking of using Spark to authenticate multiple users and then > Spark use a shared account on Ignite cache > Basically, Ignite provides basic security interfaces and some implementations which you can rely on by building your secure solution. This article can be useful for your case http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/ <http://smartkey.co.uk/development/securing-an-apache-ignite-cluster/> — Denis > > If you need a real multi-tenancy support where cacheA is allowed to be > accessed by a group of users A only and cacheB by users from group B then you > can take a look at GridGain which is built on top of Ignite > https://gridgain.readme.io/docs/multi-tenancy > <https://gridgain.readme.io/docs/multi-tenancy> > > > > OK but I am evaluating open source only solutions (kylin, druid, alluxio...), > it's a constraint from my hierarchy >> >> What I want to achieve is : >> - use Cassandra for data store as it provides idempotence (HDFS/hive >> doesn't), resulting in exactly once semantic without any duplicates. >> - use Spark SQL thriftserver in multi tenancy for large scale adhoc >> analytics queries (> TB) from an ODBC driver through HTTP(S) >> - accelerate Cassandra reads when the data modeling of the Cassandra table >> doesn't fit the queries. 
Queries would be OLAP style: target multiple C* >> partitions, groupby or filters on lots of dimensions that aren't necessarely >> in the C* table key. >> > > As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep > this in mind. Before trying to assemble all the chain I would recommend you > trying to connect Spark SQL thrift server directly to Ignite and work with > its shared RDDs [1]. A shared RDD (basically Ignite cache) can be backed by > Cassandra. Probably this chain will work for you but I can’t give more > precise guidance on this. > > > I will try to make it works and give you feedback > > > [1] https://apacheignite
Re: spark SQL thriftserver over ignite and cassandra
Hi thanks for your explanations. Please find inline more questions Vincent 2016-10-05 3:33 GMT+02:00 Denis Magda <dma...@gridgain.com>: > Hi Vincent, > > See my answers inline > > On Oct 4, 2016, at 12:54 AM, vincent gromakowski < > vincent.gromakow...@gmail.com> wrote: > > Hi, > I know that Ignite has SQL support but: > - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to > integrate on corporate networks with rules, firewalls, proxies > > > *Igor Sapego*, what URIs are supported presently? > > - The SQL engine doesn't seem to scale like Spark SQL would. For instance, > Spark won't generate OOM is dataset (source or result) doesn't fit in > memory. From Ignite side, it's not clear… > > > OOM is not related to scalability topic at all. This is about > application’s logic. > > Ignite SQL engine perfectly scales out along with your cluster. Moreover, > Ignite supports indexes which allows you to get O(logN) running time > complexity for your SQL queries while in case of Spark you will face with > full-scans (O(N)) all the time. > > However, to benefit from Ignite SQL queries you have to put all the data > in-memory. Ignite doesn’t go to a CacheStore (Cassandra, relational > database, MongoDB, etc) while a SQL query is executed and won’t preload > anything from an underlying CacheStore. Automatic preloading works for > key-value queries like cache.get(key). > This is an issue because I will potentially have to query TB of data. If I use Spark thriftserver backed by IgniteRDD, does it solve this point and can I get automatic preloading from C* ? > > - Spark thrift can manage multi tenancy: different users can connect to > the same SQL engine and share cache. In Ignite it's one cache per user, so > a big waste of RAM. > > > Everyone can connect to an Ignite cluster and work with the same set of > distributed caches. I’m not sure why you need to create caches with the > same content for every user. 
> It's a security issue, Ignite cache doesn't provide multiple user account per cache. I am thinking of using Spark to authenticate multiple users and then Spark use a shared account on Ignite cache > > If you need a real multi-tenancy support where cacheA is allowed to be > accessed by a group of users A only and cacheB by users from group B then > you can take a look at GridGain which is built on top of Ignite > https://gridgain.readme.io/docs/multi-tenancy > > > OK but I am evaluating open source only solutions (kylin, druid, alluxio...), it's a constraint from my hierarchy > > What I want to achieve is : > - use Cassandra for data store as it provides idempotence (HDFS/hive > doesn't), resulting in exactly once semantic without any duplicates. > - use Spark SQL thriftserver in multi tenancy for large scale adhoc > analytics queries (> TB) from an ODBC driver through HTTP(S) > - accelerate Cassandra reads when the data modeling of the Cassandra table > doesn't fit the queries. Queries would be OLAP style: target multiple C* > partitions, groupby or filters on lots of dimensions that aren't > necessarely in the C* table key. > > > As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep > this in mind. Before trying to assemble all the chain I would recommend you > trying to connect Spark SQL thrift server directly to Ignite and work with > its shared RDDs [1]. A shared RDD (basically Ignite cache) can be backed by > Cassandra. Probably this chain will work for you but I can’t give more > precise guidance on this. > > I will try to make it works and give you feedback > [1] https://apacheignite-fs.readme.io/docs/ignite-for-spark > > — > Denis > > Thanks for your advises > > > 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>: > >> I am not sure that this will be performant. What do you want to achieve >> here? Fast lookups? Then the Cassandra Ignite store might be the right >> solution. 
If you want to do more analytic style of queries then you can put >> the data on HDFS/Hive and use the Ignite HDFS cache to cache certain >> partitions/tables in Hive in-memory. If you want to go to iterative machine >> learning algorithms you can go for Spark on top of this. You can use then >> also Ignite cache for Spark RDDs. >> >> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> >> wrote: >> >> Hi, Vincent! >> >> Ignite also has SQL support (also scalable), I think it will be much >> faster to query directly from Ignite than query from Spark. >> Also please mind, that before executing queries you should load all >> needed data to cache. >> To load data from Cassand
Re: spark SQL thriftserver over ignite and cassandra
Hi Vincent, See my answers inline

> On Oct 4, 2016, at 12:54 AM, vincent gromakowski > <vincent.gromakow...@gmail.com> wrote: > > Hi, > I know that Ignite has SQL support but: > - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to > integrate on corporate networks with rules, firewalls, proxies

Igor Sapego, what URIs are supported presently?

> - The SQL engine doesn't seem to scale like Spark SQL would. For instance, > Spark won't generate OOM if the dataset (source or result) doesn't fit in memory. > From Ignite's side, it's not clear…

OOM is not related to the scalability topic at all; this is about the application's logic.

Ignite's SQL engine scales out perfectly along with your cluster. Moreover, Ignite supports indexes, which give you O(log N) running time for your SQL queries, while with Spark you will face full scans (O(N)) all the time.

However, to benefit from Ignite SQL queries you have to put all the data in memory. Ignite doesn't go to a CacheStore (Cassandra, relational database, MongoDB, etc.) while a SQL query is executed and won't preload anything from an underlying CacheStore. Automatic preloading works for key-value queries like cache.get(key).

> - Spark thrift can manage multi tenancy: different users can connect to the > same SQL engine and share cache. In Ignite it's one cache per user, so a big > waste of RAM.

Everyone can connect to an Ignite cluster and work with the same set of distributed caches. I'm not sure why you need to create caches with the same content for every user.

If you need real multi-tenancy support, where cacheA is allowed to be accessed only by a group of users A and cacheB by users from group B, then you can take a look at GridGain, which is built on top of Ignite: https://gridgain.readme.io/docs/multi-tenancy

> > What I want to achieve is : > - use Cassandra for data store as it provides idempotence (HDFS/hive > doesn't), resulting in exactly once semantic without any duplicates.
> - use Spark SQL thriftserver in multi tenancy for large scale adhoc analytics > queries (> TB) from an ODBC driver through HTTP(S) > - accelerate Cassandra reads when the data modeling of the Cassandra table > doesn't fit the queries. Queries would be OLAP style: target multiple C* > partitions, groupby or filters on lots of dimensions that aren't necessarely > in the C* table key. > As it was mentioned Ignite uses Cassandra as a CacheStore. You should keep this in mind. Before trying to assemble all the chain I would recommend you trying to connect Spark SQL thrift server directly to Ignite and work with its shared RDDs [1]. A shared RDD (basically Ignite cache) can be backed by Cassandra. Probably this chain will work for you but I can’t give more precise guidance on this. [1] https://apacheignite-fs.readme.io/docs/ignite-for-spark — Denis > Thanks for your advises > > > 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com > <mailto:jornfra...@gmail.com>>: > I am not sure that this will be performant. What do you want to achieve here? > Fast lookups? Then the Cassandra Ignite store might be the right solution. If > you want to do more analytic style of queries then you can put the data on > HDFS/Hive and use the Ignite HDFS cache to cache certain partitions/tables in > Hive in-memory. If you want to go to iterative machine learning algorithms > you can go for Spark on top of this. You can use then also Ignite cache for > Spark RDDs. > > On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com > <mailto:akuznet...@gridgain.com>> wrote: > >> Hi, Vincent! >> >> Ignite also has SQL support (also scalable), I think it will be much faster >> to query directly from Ignite than query from Spark. >> Also please mind, that before executing queries you should load all needed >> data to cache. >> To load data from Cassandra to Ignite you may use Cassandra store [1]. 
>> >> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra >> <https://apacheignite.readme.io/docs/ignite-with-apache-cassandra> >> >> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski >> <vincent.gromakow...@gmail.com <mailto:vincent.gromakow...@gmail.com>> wrote: >> Hi, >> I am evaluating the possibility to use Spark SQL (and its scalability) over >> an Ignite cache with Cassandra persistent store to increase read workloads >> like OLAP style analytics. >> Is there any way to configure Spark thriftserver to load an external table >> in Ignite like we can do in Cassandra ? >> Here is an example of config for spark backed by cassandra >> >> CREATE EXTERNAL TABLE MyHiveTable >
Re: spark SQL thriftserver over ignite and cassandra
Do you have any remarks or corrections on my assumptions? On Oct 4, 2016 at 9:54 AM, "vincent gromakowski" <vincent.gromakow...@gmail.com> wrote: > Hi, > I know that Ignite has SQL support but: > - ODBC driver doesn't seem to provide HTTP(S) support, which is easier to > integrate on corporate networks with rules, firewalls, proxies > - The SQL engine doesn't seem to scale like Spark SQL would. For instance, > Spark won't generate OOM is dataset (source or result) doesn't fit in > memory. From Ignite side, it's not clear... > - Spark thrift can manage multi tenancy: different users can connect to > the same SQL engine and share cache. In Ignite it's one cache per user, so > a big waste of RAM. > > What I want to achieve is : > - use Cassandra for data store as it provides idempotence (HDFS/hive > doesn't), resulting in exactly once semantic without any duplicates. > - use Spark SQL thriftserver in multi tenancy for large scale adhoc > analytics queries (> TB) from an ODBC driver through HTTP(S) > - accelerate Cassandra reads when the data modeling of the Cassandra table > doesn't fit the queries. Queries would be OLAP style: target multiple C* > partitions, groupby or filters on lots of dimensions that aren't > necessarely in the C* table key. > > Thanks for your advises > > > 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>: > >> I am not sure that this will be performant. What do you want to achieve >> here? Fast lookups? Then the Cassandra Ignite store might be the right >> solution. If you want to do more analytic style of queries then you can put >> the data on HDFS/Hive and use the Ignite HDFS cache to cache certain >> partitions/tables in Hive in-memory. If you want to go to iterative machine >> learning algorithms you can go for Spark on top of this. You can use then >> also Ignite cache for Spark RDDs. >> >> On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> >> wrote: >> >> Hi, Vincent! 
>> >> Ignite also has SQL support (also scalable), I think it will be much >> faster to query directly from Ignite than query from Spark. >> Also please mind, that before executing queries you should load all >> needed data to cache. >> To load data from Cassandra to Ignite you may use Cassandra store [1]. >> >> [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra >> >> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski < >> vincent.gromakow...@gmail.com> wrote: >> >>> Hi, >>> I am evaluating the possibility to use Spark SQL (and its scalability) >>> over an Ignite cache with Cassandra persistent store to increase read >>> workloads like OLAP style analytics. >>> Is there any way to configure Spark thriftserver to load an external >>> table in Ignite like we can do in Cassandra ? >>> Here is an example of config for spark backed by cassandra >>> >>> CREATE EXTERNAL TABLE MyHiveTable >>> ( id int, data string ) >>> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' >>> >>> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" >>> = "test" , >>> "cassandra.cf.name" = "mytable" , >>> "cassandra.ks.repfactor" = "1" , >>> "cassandra.ks.strategy" = >>> "org.apache.cassandra.locator.SimpleStrategy" ); >>> >>> >> >> >> -- >> Alexey Kuznetsov >> >> >
Re: spark SQL thriftserver over ignite and cassandra
Hi, I know that Ignite has SQL support, but: - The ODBC driver doesn't seem to provide HTTP(S) support, which is easier to integrate on corporate networks with rules, firewalls and proxies. - The SQL engine doesn't seem to scale like Spark SQL would. For instance, Spark won't generate OOM if the dataset (source or result) doesn't fit in memory. From the Ignite side, it's not clear... - Spark thrift can manage multi-tenancy: different users can connect to the same SQL engine and share the cache. In Ignite it's one cache per user, so a big waste of RAM. What I want to achieve is: - use Cassandra as the data store, as it provides idempotence (HDFS/Hive doesn't), resulting in exactly-once semantics without any duplicates. - use the Spark SQL thriftserver in multi-tenancy for large-scale ad hoc analytics queries (> TB) from an ODBC driver through HTTP(S). - accelerate Cassandra reads when the data modeling of the Cassandra table doesn't fit the queries. Queries would be OLAP style: target multiple C* partitions, group-bys or filters on lots of dimensions that aren't necessarily in the C* table key. Thanks for your advice. 2016-10-04 6:51 GMT+02:00 Jörn Franke <jornfra...@gmail.com>: > I am not sure that this will be performant. What do you want to achieve > here? Fast lookups? Then the Cassandra Ignite store might be the right > solution. If you want to do more analytic style of queries then you can put > the data on HDFS/Hive and use the Ignite HDFS cache to cache certain > partitions/tables in Hive in-memory. If you want to go to iterative machine > learning algorithms you can go for Spark on top of this. You can use then > also Ignite cache for Spark RDDs. > > On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> wrote: > > Hi, Vincent! > > Ignite also has SQL support (also scalable), I think it will be much > faster to query directly from Ignite than query from Spark. > Also please mind, that before executing queries you should load all needed > data to cache. 
> To load data from Cassandra to Ignite you may use Cassandra store [1]. > > [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra > > On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski < > vincent.gromakow...@gmail.com> wrote: > >> Hi, >> I am evaluating the possibility to use Spark SQL (and its scalability) >> over an Ignite cache with Cassandra persistent store to increase read >> workloads like OLAP style analytics. >> Is there any way to configure Spark thriftserver to load an external >> table in Ignite like we can do in Cassandra ? >> Here is an example of config for spark backed by cassandra >> >> CREATE EXTERNAL TABLE MyHiveTable >> ( id int, data string ) >> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' >> >> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" >> = "test" , >> "cassandra.cf.name" = "mytable" , >> "cassandra.ks.repfactor" = "1" , >> "cassandra.ks.strategy" = >> "org.apache.cassandra.locator.SimpleStrategy" ); >> >> > > > -- > Alexey Kuznetsov > >
Re: spark SQL thriftserver over ignite and cassandra
I am not sure that this will be performant. What do you want to achieve here? Fast lookups? Then the Cassandra Ignite store might be the right solution. If you want to do more analytic-style queries, then you can put the data on HDFS/Hive and use the Ignite HDFS cache to cache certain partitions/tables in Hive in-memory. If you want to go for iterative machine learning algorithms, you can run Spark on top of this. You can then also use the Ignite cache for Spark RDDs. > On 4 Oct 2016, at 02:24, Alexey Kuznetsov <akuznet...@gridgain.com> wrote: > > Hi, Vincent! > > Ignite also has SQL support (also scalable), I think it will be much faster > to query directly from Ignite than query from Spark. > Also please mind, that before executing queries you should load all needed > data to cache. > To load data from Cassandra to Ignite you may use Cassandra store [1]. > > [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra > >> On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski >> <vincent.gromakow...@gmail.com> wrote: >> Hi, >> I am evaluating the possibility to use Spark SQL (and its scalability) over >> an Ignite cache with Cassandra persistent store to increase read workloads >> like OLAP style analytics. >> Is there any way to configure Spark thriftserver to load an external table >> in Ignite like we can do in Cassandra ? >> Here is an example of config for spark backed by cassandra >> >> CREATE EXTERNAL TABLE MyHiveTable >> ( id int, data string ) >> STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' >> TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" = >> "test" , >> "cassandra.cf.name" = "mytable" , >> "cassandra.ks.repfactor" = "1" , >> "cassandra.ks.strategy" = >> "org.apache.cassandra.locator.SimpleStrategy" ); >> > > > > -- > Alexey Kuznetsov
Re: spark SQL thriftserver over ignite and cassandra
Hi, Vincent! Ignite also has SQL support (also scalable); I think it will be much faster to query Ignite directly than to query through Spark. Also please mind that before executing queries you should load all the needed data into the cache. To load data from Cassandra into Ignite you may use the Cassandra store [1]. [1] https://apacheignite.readme.io/docs/ignite-with-apache-cassandra On Tue, Oct 4, 2016 at 4:19 AM, vincent gromakowski < vincent.gromakow...@gmail.com> wrote: > Hi, > I am evaluating the possibility to use Spark SQL (and its scalability) > over an Ignite cache with Cassandra persistent store to increase read > workloads like OLAP style analytics. > Is there any way to configure Spark thriftserver to load an external table > in Ignite like we can do in Cassandra ? > Here is an example of config for spark backed by cassandra > > CREATE EXTERNAL TABLE MyHiveTable > ( id int, data string ) > STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' > > TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" > = "test" , > "cassandra.cf.name" = "mytable" , > "cassandra.ks.repfactor" = "1" , > "cassandra.ks.strategy" = > "org.apache.cassandra.locator.SimpleStrategy" ); > > -- Alexey Kuznetsov
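The Cassandra store mentioned in [1] is wired in through the cache configuration. A minimal sketch (untested; the bean names "cassandraDataSource" and "userPersistenceSettings" are placeholders you would define elsewhere in your Spring config, following the docs linked above):

```xml
<!-- Sketch: an Ignite cache backed by Cassandra via the Cassandra cache store. -->
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="userCache"/>
    <!-- Read/write-through so key-value operations go to Cassandra. -->
    <property name="readThrough" value="true"/>
    <property name="writeThrough" value="true"/>
    <property name="cacheStoreFactory">
        <bean class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
            <property name="dataSourceBean" value="cassandraDataSource"/>
            <property name="persistenceSettingsBean" value="userPersistenceSettings"/>
        </bean>
    </property>
</bean>
```

Note that, as discussed elsewhere in this thread, SQL queries only see data already loaded into memory; the store is consulted for key-value lookups, not during SQL execution.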
spark SQL thriftserver over ignite and cassandra
Hi, I am evaluating the possibility of using Spark SQL (and its scalability) over an Ignite cache with a Cassandra persistent store, to speed up read workloads like OLAP-style analytics. Is there any way to configure the Spark thriftserver to load an external table in Ignite, like we can do with Cassandra? Here is an example of a config for Spark backed by Cassandra: CREATE EXTERNAL TABLE MyHiveTable ( id int, data string ) STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' TBLPROPERTIES ("cassandra.host" = "x.x.x.x", "cassandra.ks.name" = "test" , "cassandra.cf.name" = "mytable" , "cassandra.ks.repfactor" = "1" , "cassandra.ks.strategy" = "org.apache.cassandra.locator.SimpleStrategy" );
Re: Ignite with Cassandra
The reason is in the *"id"* field. According to the persistence descriptor, the cache key will be stored in the "id" column, but at the same time the User POJO class also has such a field. There are several options to fix this: 1) Specify another column mapping for the Ignite cache key in the XML persistence descriptor. 2) Specify a non-default column mapping for the "id" field of the User class. Here are the options to do this: a) Mark the "id" field with the *@QuerySqlField* annotation and specify a name which differs from "id", for example *@QuerySqlField(name="userId")*. b) Manually specify the column mapping for the User class in the XML persistence descriptor and make sure that the "id" field is mapped to something different from "id". 3) Manually specify the column mapping for the User class in the XML persistence descriptor and omit the "id" field - this way the "id" field from the User class simply will not be persisted into the Cassandra table. This makes sense if it holds exactly the same value as the Ignite cache key - you don't need to save the same value twice into two different columns. -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7395.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
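For option 1, a hypothetical fragment of the persistence descriptor might look like the following (a sketch only; the "user_id" column name and the com.example.User class are placeholders to adjust to your own model):

```xml
<!-- Sketch: map the Ignite cache key to a "user_id" column so it no longer
     collides with the "id" field of the User POJO. -->
<persistence keyspace="test" table="user">
    <keyPersistence class="java.util.UUID" strategy="PRIMITIVE" column="user_id"/>
    <valuePersistence class="com.example.User" strategy="POJO"/>
</persistence>
```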
Re: Ignite with Cassandra
Could you please provide the full definitions of the key and value classes and the XML persistence descriptor you are using for the Ignite cache? -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7331.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Ignite with Cassandra
Hi, Please properly subscribe to the mailing list so that the community can receive email notifications for your messages. To subscribe, send empty email to user-subscr...@ignite.apache.org and follow simple instructions in the reply. yvladimirov wrote > Hello guys! > Can you help me? > I want used Apache Ignite with Cassandra. > > My Pojo without Ignite. > > @Table(name = "user") > public class User { > @PartitionKey > private UUID id; > private String name; > > } > > It's work > > But if I used Ignite > Caused by: class org.apache.ignite.IgniteException: Failed to > prepare > Cassandra CQL statement: insert into test.user (id, name, id) values > (?,?,?); > at > org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.prepareStatement(CassandraSessionImpl.java:515) > at > org.apache.ignite.cache.store.cassandra.session.CassandraSessionImpl.execute(CassandraSessionImpl.java:132) > ... 25 more > > > How to fix duplication ID? What do you use as a key? -Val -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-tp7242p7330.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Ignite with Cassandra and SSL
Hi, This is a duplicate discussion of the following http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-td5610.html#a5700 You will find a solution in the discussion above. -- Denis Good morning Could you please help me understand how to establish persistence to Cassandrea via SSL? What else do I need to ensure apart from setting the below flag to true useSSL false Enables the use of SSL -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611p5952.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Ignite with Cassandra and SSL
Hi, First of all you need to create an SSLContext object and use it to initialize SSLOptions. Here is a good example [1] provided by Cassandra community. After that pass this SSLOptions object into Ignite's Cassandra DataSource object using DataSource.setSslOptions method. [1] https://github.com/datastax/java-driver/blob/2.1.6/driver-core/src/test/java/com/datastax/driver/core/SSLTestBase.java#L84-L114 <https://github.com/datastax/java-driver/blob/2.1.6/driver-core/src/test/java/com/datastax/driver/core/SSLTestBase.java#L84-L114> — Denis > On Jun 23, 2016, at 6:47 PM, ChickyDutt <ash.dutt.sha...@gmail.com> wrote: > > Attached is my connection-setting.xml file. I have enabled the useSSL > property to true. Can you show me the way to refer the Cassandra node > keystore file and the password through SSLOptions and then include it in the > attached file. > > Please let me know if you need any further information. > > Regards. > Chicky > > On Thu, Jun 23, 2016 at 4:34 PM, Ashish Dutt Sharma <[hidden email] > > wrote: > A gentle reminder. Could you please help me out on this? > > How do I pass the Keystore and the password in SSLOptions in > Cassandra.DataSource? > > Regards, > Ashish Sharma > > On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] > <[hidden email] > > wrote: > Igor R., Alexey K. or Val, > > Is SSL presently supported for Cassandra cache store? > > — > Denis > >> On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email] >> <http://user/SendEmail.jtp?type=node=5700=0>> wrote: >> >> Good morning >> >> Could you please help me understand how to establish persistence to >> Cassandrea via SSL? 
>> >> What else do I need to ensure apart from setting the below flag to true >> useSSL false Enables the use of SSL > > connection-settings.xml (2K) Download Attachment > <http://apache-ignite-users.70518.x6.nabble.com/attachment/5847/0/connection-settings.xml>
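Putting the steps above into connection-settings.xml, the Cassandra DataSource bean might look roughly like this (an untested sketch; the "sslOptions" bean is a placeholder that you would build from an SSLContext initialized with your keystore/truststore, as in the DataStax example linked above):

```xml
<!-- Sketch: Cassandra DataSource with SSL enabled and SSLOptions passed in. -->
<bean id="cassandraDataSource"
      class="org.apache.ignite.cache.store.cassandra.datasource.DataSource">
    <property name="contactPoints" value="x.x.x.x"/>
    <property name="useSSL" value="true"/>
    <!-- "sslOptions" references a bean built programmatically or elsewhere
         in the config from your SSLContext. -->
    <property name="sslOptions" ref="sslOptions"/>
</bean>
```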
Re: Ignite with Cassandra and SSL
Attached is my connection-setting.xml file. I have enabled the useSSL property to true. Can you show me the way to refer the Cassandra node keystore file and the password through SSLOptions and then include it in the attached file. Please let me know if you need any further information. Regards. Chicky On Thu, Jun 23, 2016 at 4:34 PM, Ashish Dutt Sharma < ash.dutt.sha...@gmail.com> wrote: > A gentle reminder. Could you please help me out on this? > > How do I pass the Keystore and the password in SSLOptions in > Cassandra.DataSource? > > Regards, > Ashish Sharma > > On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] < > ml-node+s70518n5700...@n6.nabble.com> wrote: > >> Igor R., Alexey K. or Val, >> >> Is SSL presently supported for Cassandra cache store? >> >> — >> Denis >> >> On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email] >> <http:///user/SendEmail.jtp?type=node=5700=0>> wrote: >> >> Good morning >> >> Could you please help me understand how to establish persistence to >> Cassandrea via SSL? >> >> What else do I need to ensure apart from setting the below flag to true >> *useSSL* false Enables the use of SSL >> >> -- >> View this message in context: Ignite with Cassandra and SSL >> <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html> >> Sent from the Apache Ignite Users mailing list archive >> <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com >> <http://nabble.com>. >> >> >> >> >> -- >> If you reply to this email, your message will be added to the discussion >> below: >> >> http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html >> To start a new topic under Apache Ignite Users, email >> ml-node+s70518n...@n6.nabble.com >> To unsubscribe from Apache Ignite Users, click here >> <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YXNoLmR1dHQuc2hhcm1hQGdtYWlsLmNvbXwxfC0xOTcwMDkyNjky> >> . 
connection-settings.xml (2K) <http://apache-ignite-users.70518.x6.nabble.com/attachment/5847/0/connection-settings.xml> -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5847.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Ignite with Cassandra and SSL
A gentle reminder. Could you please help me out on this? How do I pass the Keystore and the password in SSLOptions in Cassandra.DataSource? Regards, Ashish Sharma On Fri, Jun 17, 2016 at 12:53 AM, Denis Magda [via Apache Ignite Users] < ml-node+s70518n5700...@n6.nabble.com> wrote: > Igor R., Alexey K. or Val, > > Is SSL presently supported for Cassandra cache store? > > — > Denis > > On Jun 13, 2016, at 11:34 AM, ChickyDutt <[hidden email] > <http:///user/SendEmail.jtp?type=node=5700=0>> wrote: > > Good morning > > Could you please help me understand how to establish persistence to > Cassandrea via SSL? > > What else do I need to ensure apart from setting the below flag to true > *useSSL* false Enables the use of SSL > > -- > View this message in context: Ignite with Cassandra and SSL > <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html> > Sent from the Apache Ignite Users mailing list archive > <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com > <http://nabble.com>. > > > > > -- > If you reply to this email, your message will be added to the discussion > below: > > http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5700.html > To start a new topic under Apache Ignite Users, email > ml-node+s70518n...@n6.nabble.com > To unsubscribe from Apache Ignite Users, click here > <http://apache-ignite-users.70518.x6.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code=1=YXNoLmR1dHQuc2hhcm1hQGdtYWlsLmNvbXwxfC0xOTcwMDkyNjky> > . 
-- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5610p5843.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Ignite with Cassandra and SSL
Igor R., Alexey K. or Val, Is SSL presently supported for Cassandra cache store? — Denis > On Jun 13, 2016, at 11:34 AM, ChickyDutt <ash.dutt.sha...@gmail.com> wrote: > > Good morning > > Could you please help me understand how to establish persistence to > Cassandrea via SSL? > > What else do I need to ensure apart from setting the below flag to true > useSSLfalse Enables the use of SSL > > View this message in context: Ignite with Cassandra and SSL > <http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html> > Sent from the Apache Ignite Users mailing list archive > <http://apache-ignite-users.70518.x6.nabble.com/> at Nabble.com.
Ignite with Cassandra and SSL
Good morning Could you please help me understand how to establish persistence to Cassandra via SSL? What else do I need to ensure apart from setting the below flag to true? *useSSL* (default: false): enables the use of SSL -- View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-with-Cassandra-and-SSL-tp5611.html Sent from the Apache Ignite Users mailing list archive at Nabble.com.