Thin client vs client node performance in Spark

2018-08-14 Thread eugene miretsky
Hello, What are the tradeoffs of using the thin client vs client node? Are there any benchmarks? The Spark client is using the latter (client node) - is that for performance reasons or just legacy? Cheers, Eugene

Source code of latest benchmarks

2018-08-16 Thread eugene miretsky
Hi, I am trying to write my own benchmark. The Ignite benchmarks GitHub repository linked on the website is very old. Is there a newer version I could work off? Cheers, Eugene

Confusion/inaccurate visor stats

2018-08-17 Thread eugene miretsky
Hello, I am running a single Ignite node on an r4.8xlarge EC2 node. I am using the default settings with 132G allocated for the default memory region. So far I have uploaded 1 large table (60M rows) using Spark. The output of the node and cache commands is pasted below. A few questions: 1) In Data

Data modeling for segmenting a huge data set: precomputing vs real time computations

2018-08-21 Thread eugene miretsky
Hello, We have a very big data set (200M+ users), for each user we store their activity (views, sales, etc.) and we need to be able to segment the users based on that data. One (very simplified) use case looks something like: - Data: customer_id, date, category, sub_category, action - Query: All

Slow SQL query uses only a single CPU

2018-08-21 Thread eugene miretsky
Hi, We have a cache called GAL3EC1, it has 1. A composite pKey consisting of customer_id and date 2. An Index on the date column 3. 300 sparse columns We are running a single EC2 4x8xlarge node. The following query takes 8min to finish Select COUNT (*) FROM (SELECT customer_id FROM

Re: Slow SQL query uses only a single CPU

2018-08-21 Thread eugene miretsky
-05-12' GROUP BY __Z0.CUSTOMER_ID SELECT COUNT(*) FROM ( SELECT __C0_0 AS CUSTOMER_ID FROM PUBLIC.__T0 GROUP BY __C0_0 HAVING (SUM(__C0_1) > 2) AND (MAX(__C0_2) < 1) ) _0__Z1 On Tue, Aug 21, 2018 at 8:18 PM, eugene miretsky wrote: > Hi, > > We have a cache called GAL3EC1, it has >

Recommended HW on AWS EC2 - vertical vs horizontal scaling

2018-08-21 Thread eugene miretsky
Hello, We are looking to set up a fairly large (a few TB) cluster in AWS for OLAP and transactional use cases. 1. Are there any Ignite benchmarks on horizontal vs vertical scaling? 2. What EC2 instances are other people using in prod? (I am assuming that one of the memory optimized

Re: Recommended HW on AWS EC2 - vertical vs horizontal scaling

2018-08-24 Thread eugene miretsky
Thanks Andrei, For the use case, please see my email ("Data modeling for segmenting a huge data set: precomputing vs real time computations"). I think our main confusion right now is trying to understand how exactly SQL queries work (when memory is moved to heap, when/how is H2 used, how the reduce

Re: Thin client vs client node performance in Spark

2018-08-24 Thread eugene miretsky
Thanks, So the way I understand it, the thick client will use the affinity key to send data to the right node, and hence will split the traffic between all the nodes. The thin client will just send the data to one node, and that node will be responsible for sending it to the actual node that owns the
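The routing difference described in this message can be sketched with a toy model. This is only an illustration under stated assumptions: the node names, the partition count, the CRC32 hash and the round-robin partition assignment are all hypothetical and are not Ignite's actual affinity function.

```python
import zlib
from collections import Counter

NODES = ["node-a", "node-b", "node-c"]  # hypothetical server nodes
PARTITIONS = 1024                       # illustrative partition count


def partition_for(affinity_key: str) -> int:
    """Map an affinity key to a partition using a stable hash."""
    return zlib.crc32(affinity_key.encode("utf-8")) % PARTITIONS


def owner_of(partition: int) -> str:
    """Toy partition-to-node assignment (round robin, not Ignite's real one)."""
    return NODES[partition % len(NODES)]


def thick_client_targets(keys):
    """A partition-aware client sends each key straight to its owner node."""
    return Counter(owner_of(partition_for(k)) for k in keys)


def thin_client_targets(keys, gateway="node-a"):
    """A single-connection client sends every key to one gateway node.

    Returns the first-hop load and the number of entries the gateway
    has to forward to another node (one extra network hop each).
    """
    first_hop = Counter({gateway: len(keys)})
    forwarded = sum(1 for k in keys if owner_of(partition_for(k)) != gateway)
    return first_hop, forwarded


if __name__ == "__main__":
    keys = [f"customer-{i}" for i in range(3000)]
    print("thick client, load per node:", dict(thick_client_targets(keys)))
    first_hop, forwarded = thin_client_targets(keys)
    print("thin client, first hop:", dict(first_hop), "forwarded:", forwarded)
```

In this model the thick client spreads writes across the owners, while the thin client concentrates the first hop on one node and pays a forwarding hop for every key that node does not own.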

Re: How much heap to allocate

2018-08-24 Thread eugene miretsky
Thanks! I am trying to understand when and how data is moved from off-heap to on-heap, particularly when using SQL. I took a look at the wiki but still have a few questions. My understanding is that data

Re: Thin client vs client node performance in Spark

2018-08-24 Thread eugene miretsky
) at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109) at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3490) On Fri, Aug 24, 2018 at 11:18 AM, eugene miretsky wrote: > Thanks, >

Spark Dataframe write is hanging

2018-08-21 Thread eugene miretsky
Hi, When I am saving a Spark DataFrame to Ignite, the job sometimes gets stuck with the attached error. It seems to happen at random, and usually at the end of the job (all the data has already been written to Ignite). Has anybody encountered this before? java.net.ConnectException: Connection

Re: How much heap to allocate

2018-08-27 Thread eugene miretsky
else > is persisted to disk. > > 5) Do you have any general advice on benchmarking the memory requirement? >> So far I have not been able to find a way to check how much memory each >> table takes on and off heap, and how much memory each query takes. > > > We use Yardsti

Re: Slow SQL query uses only a single CPU

2018-08-22 Thread eugene miretsky
oc/org/ > apache/ignite/cache/query/SqlFieldsQuery.html#setCollocated-boolean- > [2] https://apacheignite.readme.io/v2.0/docs/sql-performance- > and-debugging#index-hints > > > On Wed, Aug 22, 2018 at 3:10 PM eugene miretsky > wrote: > >> Thanks Andrey, >> >> Right now we are

Re: Slow SQL query uses only a single CPU

2018-08-22 Thread eugene miretsky
> 2) AND (MAX(__C0_2) < 1) ) _0__Z1 On Wed, Aug 22, 2018 at 9:43 AM, eugene miretsky wrote: > Thanks Andrey, > > We are using the Ignite notebook, any idea if there is a way to provide > these flags and hints directly from SQL? > > From your description, it seems like th

Re: Slow SQL query uses only a single CPU

2018-08-22 Thread eugene miretsky
d on map phase and on reduce. > The map query processes node-local data (unless distributed joins are on), while > the reduce phase fetches data from remote nodes, which may be costly. > > > On Wed, Aug 22, 2018 at 6:07 AM eugene miretsky > wrote: > >> Here is the res

Re: Slow SQL query uses only a single CPU

2018-08-22 Thread eugene miretsky
>> >> MAX(__Z0.RU_TOTAL_WEB_SESSIONS_COUNT) AS __C0_2 >> >> FROM PUBLIC.GAL2RU __Z0 >> >> /* PUBLIC.DT_IDX2: DT > '2018-06-12' */ >> >> WHERE __Z0.DT > '2018-06-12' >> >> GROUP BY __Z0.CUSTOMER_ID >> >> >> SELECT >>

How much heap to allocate

2018-08-22 Thread eugene miretsky
Hi, I am getting the following warning when starting Ignite - " Nodes started on local machine require more than 20% of physical RAM what can lead to significant slowdown due to swapping " The 20% is a typo in version 2.5, it should be 80%. We have increased the max size of the default region
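The arithmetic behind a warning like this can be sketched as follows. This is a hedged illustration, not Ignite's actual check: it assumes the warning compares maximum heap plus the maximum data region size against physical RAM, and it borrows example figures from other threads in this list (a 132G default region and -Xmx64g) together with the 244 GiB of an r4.8xlarge instance. The function names are hypothetical.

```python
def ram_fraction(heap_max_gb: float, data_region_max_gb: float,
                 physical_ram_gb: float) -> float:
    """Fraction of physical RAM a node may claim (heap plus off-heap region)."""
    return (heap_max_gb + data_region_max_gb) / physical_ram_gb


def exceeds_threshold(heap_max_gb: float, data_region_max_gb: float,
                      physical_ram_gb: float, threshold: float = 0.8) -> bool:
    """True if the configured memory goes over the assumed 80% threshold."""
    return ram_fraction(heap_max_gb, data_region_max_gb, physical_ram_gb) > threshold


if __name__ == "__main__":
    # Illustrative figures only: 64 GB heap, 132 GB data region, 244 GiB RAM.
    print(round(ram_fraction(64, 132, 244), 3))
    print(exceeds_threshold(64, 132, 244))
    # The same region with a 16 GB heap stays under the threshold.
    print(exceeds_threshold(16, 132, 244))
```

Under these assumptions a 64 GB heap plus a 132 GB region lands just above 80% of 244 GiB, which is the kind of configuration that would trigger the warning.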

Configurations precedence and consistency across the cluster

2018-08-21 Thread eugene miretsky
Hello, It looks like there are several ways to set cluster configuration (I am currently looking at CacheConfiguration in particular): 1. Via the configuration XML file provided at startup 2. Via the Java client (CacheConfiguration object) 3. Via SQL commands. I have a few questions

Node keeps crashing under load

2018-08-30 Thread eugene miretsky
Hello, I have a medium cluster set up for testing - 3 x r4.8xlarge EC2 nodes. It has persistence enabled, and zero backups. - Full configs are attached. - JVM settings are: JVM_OPTS="-Xms16g -Xmx64g -server -XX:+AggressiveOpts -XX:MaxMetaspaceSize=256m -XX:+AlwaysPreTouch -XX:+UseG1GC

Query 3x slower with index

2018-09-01 Thread eugene miretsky
Hello, Schema: - PUBLIC.GATABLE2.CUSTOMER_ID PUBLIC.GATABLE2.DT PUBLIC.GATABLE2.CATEGORY_ID PUBLIC.GATABLE2.VERTICAL_ID PUBLIC.GATABLE2.SERVICE PUBLIC.GATABLE2.PRODUCT_VIEWS_APP PUBLIC.GATABLE2.PRODUCT_CLICKS_APP PUBLIC.GATABLE2.PRODUCT_VIEWS_WEB

Re: ignte cluster hang with GridCachePartitionExchangeManager

2018-09-07 Thread eugene miretsky
Hi Wangsan, So what was the original cause of the issue? Was it blocking the listening thread in your test code or something else? We are having similar issues. Cheers, Eugene On Mon, Sep 3, 2018 at 1:23 PM Ilya Kasnacheev wrote: > Hello! > > The operation will execute after partition map

How to set node Id?

2018-08-30 Thread eugene miretsky
Hello, Is it possible to set a nodeId when restarting a node? How is the id generated? Sometimes after the cluster crashes, when I restart a node I get the following error: Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to add node to topology because it has the same hash code

Re: How to set node Id?

2018-08-31 Thread eugene miretsky
isions before. Are you using > Ignite persistence and what's your version? If you scroll to the end of > this paragraph, you'll find an explanation on how the IDs are generated: > > https://apacheignite.readme.io/docs/distributed-persistent-store#section-usage > > -- > Denis > >

Partition map exchange in detail

2018-09-07 Thread eugene miretsky
Hello, Our cluster occasionally fails with "partition map exchange failure" errors. I have searched around and it seems that a lot of people have had a similar issue in the past. My high-level understanding is that when one of the nodes fails (out of memory, exception, GC etc.) nodes fail to

Re: Partition map exchange in detail

2018-09-07 Thread eugene miretsky
common reason of PME hang-up is > pending cache operation that couldn't finish. Check your logs - it should > list pending transactions and atomic updates. Search for "Found long > running" substring. > > Hope this helps. > > On Fri, Sep 7, 2018 at 11:45 PM, eugene miretsk

Re: Node keeps crashing under load

2018-09-11 Thread eugene miretsky
y to specify your external address (such as 172.21.85.213) with > TcpCommunicationSpi.setLocalAddress() on each node. > > Regards, > -- > Ilya Kasnacheev > > > Fri, Sep 7, 2018 at 20:01, eugene miretsky : > >> Hi all, >> >> Can somebody please provide some pointers o

Re: Role of H2 datbase in Apache Ignite

2018-10-11 Thread eugene miretsky
although there is a lot of > work done by Ignite to make the distributed > > map-reduce work like creating temporary tables for intermediate results. > > > > Stan > > > > *From: *eugene miretsky > *Sent: *October 9, 2018, 21:52 > *To: *user@ignite.apach

Re: Query 3x slower with index

2018-10-11 Thread eugene miretsky
can’t be used instead of it. > > On the other hand, (customer_id, category_id, dt) can - the last part of > the index will be left unused. > > > > Thanks, > > Stan > > > > *From: *eugene miretsky > *Sent: *October 9, 2018, 19:40 > *To: *user@ignite.apach
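The point about an index whose last part is left unused can be illustrated with the generic leftmost-prefix rule for composite indexes. This is a minimal sketch of that general rule, not the planner logic of Ignite or H2; the function name and the example predicate sets are hypothetical.

```python
def usable_prefix(index_columns, predicate_columns):
    """Return the leading index columns that equality predicates can use.

    A composite index is walked from its first column; the usable part
    stops at the first column the query does not constrain.
    """
    predicates = set(predicate_columns)
    prefix = []
    for column in index_columns:
        if column not in predicates:
            break
        prefix.append(column)
    return prefix


if __name__ == "__main__":
    index = ("customer_id", "category_id", "dt")
    # Query constrains customer_id and category_id: dt is simply left unused.
    print(usable_prefix(index, {"customer_id", "category_id"}))
    # Query constrains only dt: no leading column matches, so nothing is usable.
    print(usable_prefix(index, {"dt"}))
```

Under this rule, an index on (customer_id, category_id, dt) can stand in for one on (customer_id, category_id), while an index that leads with an unconstrained column cannot.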

Re: Query 3x slower with index

2018-10-09 Thread eugene miretsky
> will perform better or worse on real data. That's why I need a subset of > data which will make query execution speed readily visible. Unfortunately, > I can't deduce that from query plan alone. > > Regards, > -- > Ilya Kasnacheev > > > Mon, Sep 24

Re: Role of H2 datbase in Apache Ignite

2018-10-09 Thread eugene miretsky
Hello, I have been struggling with this question myself for a while now too. I think the documents are very ambiguous on how exactly H2 is being used. The document that you linked says "Apache Ignite leverages from H2's SQL query parser and optimizer as well as the execution planner. Lastly, *H2

Re: How much heap to allocate

2018-08-30 Thread eugene miretsky
us from making real time transactional queries. (We >> are hoping to use Ignite for both OLAP and simple real time queries.) > > > I would start a separate discussion for this, bringing this question to the > attention of our SQL experts. I'm not one of them. > > --

Re: Query 3x slower with index

2018-09-20 Thread eugene miretsky
_id second? > Note that Ignite will use only one index when joining two tables and that > in your case it should start with category_id. > > You can also try adding affinity key to this index in various places, see > if it helps further. > > Regards, > -- > Ilya Kasnacheev >

Re: Query 3x slower with index

2018-09-24 Thread eugene miretsky
cheev wrote: > Hello! > > Can you share a reproducer project which loads (or generates) data for > caches and then queries them? I could try and debug it if I had the > reproducer. > > Regards. > -- > Ilya Kasnacheev > > > Thu, Sep 20, 2018 at 21:05, eugene miretsk

Ignite + Spark: json4s versions are incompatible

2018-09-26 Thread eugene miretsky
Hello, Spark provides json4s 3.2.X, while Ignite uses the newest version. This seems to cause an error when using some Spark SQL commands that use json4s methods that no longer exist. Adding Ignite to our existing Spark code bases seems to break things. How do people work around this issue?

Re: Query 3x slower with index

2018-09-19 Thread eugene miretsky
e _key_PK as index. If your primary key is > composite, it won't work properly for you. I recommend creating an explicit > (category_id, customer_id) index. > > Regards, > -- > Ilya Kasnacheev > > > Tue, Sep 18, 2018 at 17:47, eugene miretsky : > >> Hi Ilya, >>

Re: Partition map exchange in detail

2018-09-12 Thread eugene miretsky
> Discovery. Could you please re-phrase it? > > Wed, Sep 12, 2018 at 17:54, Ilya Lantukh : > >> Pavel K., can you please answer about Zookeeper discovery? >> >> On Wed, Sep 12, 2018 at 5:49 PM, eugene miretsky < >> eugene.miret...@gmail.com> wrote: >> >

IGNITE-8386 question (composite pKeys)

2018-09-12 Thread eugene miretsky
Hi, A question regarding https://issues.apache.org/jira/browse/IGNITE-8386?focusedCommentId=16511394=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16511394 It states that a pkey index with a composite pKey is "effectively useless". Could you please explain why is

Re: Node keeps crashing under load

2018-09-12 Thread eugene miretsky
> I think it's not the first time I have seen this problem but I have > positively no idea how to tackle it. > Maybe Docker experts could chime in? > > Regards, > -- > Ilya Kasnacheev > > > Wed, Sep 12, 2018 at 3:29, eugene miretsky : > >> Thanks Ilya, >>

Re: Partition map exchange in detail

2018-09-12 Thread eugene miretsky
that case and the cluster will continue to live. > > > Wed, Sep 12, 2018 at 18:53, eugene miretsky : > >> Hi Pavel, >> >> The issue we are discussing is PME failing because one node cannot >> communicate to another node, that's what IEP-25 is trying to solve. But

Re: How much heap to allocate

2018-09-12 Thread eugene miretsky
ess how much space it > would take in heap, but I think it would be ~50-100 bytes per customer_id. > So if you have N customers, it would be (100 * N) bytes > 3) Please see > https://apacheignite-sql.readme.io/docs/performance-and-debugging > > Vladimir. > > On Thu, Aug 30, 201
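The estimate quoted in this reply (roughly 50-100 bytes of heap per customer_id, so about 100 * N bytes for N customers) can be worked through as a quick calculation. This is only a sketch of that rule of thumb; the figure of 200 million customers is borrowed from the "200M+ users" data set mentioned in another thread and is an assumption here, as are the function names.

```python
def heap_estimate_bytes(customers: int, bytes_per_customer: int = 100) -> int:
    """Rule-of-thumb heap needed to hold one grouped entry per customer_id."""
    return customers * bytes_per_customer


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    n = 200_000_000  # assumed customer count, taken from the "200M+ users" thread
    low = heap_estimate_bytes(n, 50)    # lower end of the quoted range
    high = heap_estimate_bytes(n, 100)  # upper end, the (100 * N) figure
    print(f"{low:,} to {high:,} bytes")
    print(f"about {to_gib(low):.1f} to {to_gib(high):.1f} GiB")
```

For that assumed N, the range is 10 to 20 billion bytes, so a GROUP BY over every customer_id would need a heap sized in the tens of gigabytes under this rule of thumb.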

Re: Query 3x slower with index

2018-09-12 Thread eugene miretsky
duce phase as it uses sorted index lookup, >> while the second query has to process the full dataset on the map phase before passing >> it for reducing. >> >> Try to use a composite index (customer_id, category_id). >> >> Also, the SqlFieldsQuery.setCollocated(true) flag can help Ignite

Re: Failed to wait for initial partition map exchange

2018-09-12 Thread eugene miretsky
Do you have persistence enabled? On Wed, Sep 12, 2018 at 6:31 PM ndipiazza3565 < nicholas.dipia...@lucidworks.com> wrote: > I'm trying to build up a list of possible causes for this issue. > > I'm only really interested in the issues that occur after successful > production deployments. Meaning

Re: Partition map exchange in detail

2018-09-12 Thread eugene miretsky
elated problem. > 4) How can you ensure that partition maps on coordinator are *latest *without > "freezing" cluster state for some time? > > On Sat, Sep 8, 2018 at 3:21 AM, eugene miretsky > wrote: > >> Thanks! >> >> We are using persistence, so I am n

SQL query and Indexes architecture

2018-09-14 Thread eugene miretsky
Hello, Trying to understand how exactly SQL queries are executed in Ignite. A few questions 1. To what extent is H2 used? Does it store the data? Does it create the indexes? Is it used only for generating execution plans? I believe that all the data used to be stored in H2, but with

Backup failover with persistence

2018-09-14 Thread eugene miretsky
What is the process when a node goes down and then restarts? Say backups = 1. We have node A that is primary for some key, and node B that is back up. Node A goes down and then restarts after 5 min. What are the steps? 1) Node A is servicing all traffic for key X 2) Node A goes down 3) Node B

Handling split brain with Zookeeper and persistence

2018-09-14 Thread eugene miretsky
Hi, What are best practices for handling split brain with persistence? 1) Does Zookeeper split brain resolver consider all nodes as the same (client, memory only, persistent). Ideally, we want to shut down persistent nodes only as a last resort. 2) If a persistent node is shut down, we need to

Re: Configurations precedence and consistency across the cluster

2018-09-14 Thread eugene miretsky
Thanks for the response! A few more follow up questions: 1) How can we change configurations of persistent caches (replication and recovery settings for example)? 2) For client related settings, are the settings taken from the server config, or client config (partitionLossPolicy or SQL table

Re: Network Segmentation

2018-09-14 Thread eugene miretsky
What does it provide on top of the Zookeeper split brain resolver? https://apacheignite.readme.io/docs/zookeeper-discovery On Fri, Sep 14, 2018 at 3:59 AM Kopilov wrote: > luqmanahmad, what does this plugin definitely do? > Can it help to avoid segmentation or only to detect them? > Can it be

Re: Partition map exchange in detail

2018-09-12 Thread eugene miretsky
> here (in general case). However, it is true that PME could be simplified or > completely avoided for certain cases and the community is currently working > on such optimizations (https://issues.apache.org/jira/browse/IGNITE-9558 > for example). > > On Wed, Sep 12, 2018 at 9:08

Re: SQL query and Indexes architecture

2018-09-17 Thread eugene miretsky
es (?). > 2. Maybe you're right, I have to admit I'm unfamiliar with precise details > here. > > Regards, > -- > Ilya Kasnacheev > > > Mon, Sep 17, 2018 at 16:02, eugene miretsky : > >> Thanks! >> >> >>1. >>1. "Ignite feeds H2

Re: Backup failover with persistence

2018-09-17 Thread eugene miretsky
ards, > -- > Ilya Kasnacheev > > > Fri, Sep 14, 2018 at 22:23, eugene miretsky : > >> What is the process when a node goes down and then restarts? >> >> Say backups = 1. We have node A that is primary for some key, and node B >> that is back up. >&

Re: Backup failover with persistence

2018-09-17 Thread eugene miretsky
> Ilya Kasnacheev > > > Mon, Sep 17, 2018 at 15:45, eugene miretsky : > >> How is "finish syncing" defined? Since it is a distributed system there is >> no way to guarantee that node A is 100% caught up to node B. In Kafka there >> is a replica.lag.time.ma

Re: Query 3x slower with index

2018-09-17 Thread eugene miretsky
Hello, Just wanted to see if anybody had time to look into this. Cheers, Eugene On Wed, Sep 12, 2018 at 6:29 PM eugene miretsky wrote: > Thanks! > > Tried joining with an inlined table instead of IN as per the second > suggestion, and it didn't quite work. > > Query1: >

Re: SQL query and Indexes architecture

2018-09-17 Thread eugene miretsky
access indexed fields. > 4. With GROUP BY, lazy evaluation will not help you much. It will still > have to hold all data on heap at some point. Lazy evaluation mostly helps > with "SELECT * FROM table" type queries which provide very large and boring > result set. >

Re: Configurations precedence and consistency across the cluster

2018-09-18 Thread eugene miretsky
Thanks! A few clarifications: 1) "The first configuration with given cache name will be applied to all nodes" - what do you mean by the first configuration? The configuration of the first node that was started? Is there a gossip/consensus protocol that syncs the cache configs across the 2) We are

Re: How much heap to allocate

2018-09-18 Thread eugene miretsky
My understanding is that lazy loading doesn't work with group_by. On Tue, Sep 18, 2018 at 10:11 AM Mikhail wrote: > Hi Eugene, > > >For #2: wouldn't H2 need to bring the data into the heap to make the > queries? > > Or at least some of the data to do the group_by and sum operation? > > yes,

Re: Query 3x slower with index

2018-09-18 Thread eugene miretsky
appreciate if you could take a look. Cheers, Eugene On Mon, Sep 17, 2018 at 9:15 AM Ilya Kasnacheev wrote: > Hello! > > Why don't you diff the results of those two queries, tell us what the > difference is? > > Regards, > -- > Ilya Kasnacheev > > > Mon, Sep 17

Re: IGNITE-8386 question (composite pKeys)

2018-09-18 Thread eugene miretsky
nd will behave the way you expect (e.g. it will be used instead of > the affinity key index). > > > > Stan > > > > *From: *eugene miretsky > *Sent: *September 12, 2018, 23:45 > *To: *user@ignite.apache.org > *Subject: *IGNITE-8386 question (composite