Denis, thanks for the detailed response. A few more follow-up questions:

1) Are indexes loaded into heap (when used)?
2) Are full pages loaded into heap, or only the matching records?
3) When the query needs more processing than the existing index can provide (non-indexed columns, GROUP BY, aggregation), where and how does it happen?
4) How is the query coordinator chosen? Is it the client node? What about when using the web console?
5) What parallelism settings would you recommend? We were thinking of setting parallelJobsNumber to 1 and task parallelism to number of cores * 2; this way we can make sure that each job gets all the heap memory instead of all jobs fighting each other. Not sure if it makes sense, and it would also prevent us from running real-time transactional queries (we are hoping to use Ignite for both OLAP and simple real-time queries).
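
To make question 5 concrete, here is roughly the configuration we had in mind. This is only a sketch under my own assumptions: I am reading parallelJobsNumber as FifoCollisionSpi.setParallelJobsNumber and "task parallelism" as CacheConfiguration.setQueryParallelism, and the cache name "olapCache" is made up for illustration. Please correct me if these are the wrong knobs.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.fifoqueue.FifoCollisionSpi;

public class ParallelismSketch {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();

        // Assumption: parallelJobsNumber is the FIFO collision SPI setting.
        // As far as I understand it governs compute jobs; whether it has
        // any effect on SQL queries is part of what I am asking.
        FifoCollisionSpi collisionSpi = new FifoCollisionSpi();
        collisionSpi.setParallelJobsNumber(1);

        // Assumption: "task parallelism" maps to per-cache query parallelism.
        // "olapCache" is a hypothetical cache name.
        CacheConfiguration<Long, Object> cacheCfg = new CacheConfiguration<>("olapCache");
        cacheCfg.setQueryParallelism(cores * 2);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCollisionSpi(collisionSpi);
        cfg.setCacheConfiguration(cacheCfg);

        // Start a node with these settings; it stops when the block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Started node: " + ignite.name());
        }
    }
}
```

The idea is that only one heavy job runs at a time, but that job can still use all the cores.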

Cheers,
Eugene

On Sat, Aug 25, 2018 at 3:25 AM Denis Magda <[email protected]> wrote:

> Hello Eugene,
>
>> 1) In what format is data stored off heap?
>
> Data is always stored in the binary format, be it on-heap, off-heap or
> Ignite persistence.
> https://apacheignite.readme.io/docs/binary-marshaller
>
>> 2) What happens when a SQL query is executed, in particular
>>
>> - How is H2 used? How is data loaded in H2? What if some of the data
>> is on disk?
>
> H2 is used to build execution plans for SELECTs. H2 calls Ignite's B+Tree
> based indexing implementation to see which indexes are set. All the data
> and indexes are always stored in Ignite (off-heap + disk).
>
>> - When is data loaded into heap, and how much? Is only the output of
>> H2 loaded, or everything?
>
> Query results are stored in Java heap temporarily. Once the result set
> is read by your application, it will be garbage collected.
>
>> - How is the reduce stage performed? Is it performed only on one node
>> (hence that node needs to load all the data into memory)
>
> Correct, the final result set is reduced on a query coordinator - your
> application that executed a SELECT.
>
>> 3) What happens when Ignite runs out of memory during execution? Is data
>> evicted to disk (if persistence is enabled)?
>
> I guess you mean what happens if a result set doesn't fit in RAM during
> the execution, right? If so, then OOM will occur. We're working on an
> improvement that will offload the result set to disk to avoid OOM for all
> the scenarios:
> https://issues.apache.org/jira/browse/IGNITE-7526
>
>> 4) Based on the code, it looks like I need to set my data region size to
>> at most 50% of available memory (to avoid the warning), this seems a bit
>> wasteful.
>
> There is no such requirement. I know of many deployments where one
> data region is given 20% of RAM, the other is given 40% and everything else
> is persisted to disk.
>
>> 5) Do you have any general advice on benchmarking the memory requirement?
>> So far I have not been able to find a way to check how much memory each
>> table takes on and off heap, and how much memory each query takes.
>
> We use Yardstick for performance benchmarking:
> https://apacheignite.readme.io/docs/perfomance-benchmarking
>
> --
> Denis
>
> On Fri, Aug 24, 2018 at 7:06 AM eugene miretsky <[email protected]>
> wrote:
>
>> Thanks!
>>
>> I am trying to understand when and how data is moved from off-heap to on
>> heap, particularly when using SQL. I took a look at the wiki
>> <https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood>
>> but still have a few questions.
>>
>> My understanding is that data is always stored off-heap.
>>
>> 1) In what format is data stored off heap?
>> 2) What happens when a SQL query is executed, in particular
>>
>> - How is H2 used? How is data loaded in H2? What if some of the data
>> is on disk?
>> - When is data loaded into heap, and how much? Is only the output of
>> H2 loaded, or everything?
>> - How is the reduce stage performed? Is it performed only on one node
>> (hence that node needs to load all the data into memory)
>>
>> 3) What happens when Ignite runs out of memory during execution? Is data
>> evicted to disk (if persistence is enabled)?
>> 4) Based on the code, it looks like I need to set my data region size to
>> at most 50% of available memory (to avoid the warning), this seems a bit
>> wasteful.
>> 5) Do you have any general advice on benchmarking the memory
>> requirement? So far I have not been able to find a way to check how much
>> memory each table takes on and off heap, and how much memory each query
>> takes.
>>
>> Cheers,
>> Eugene
>>
>> On Fri, Aug 24, 2018 at 8:06 AM, NSAmelchev <[email protected]> wrote:
>>
>>> Hi Eugene,
>>>
>>> Yes, it's a misprint as Dmitry wrote.
>>>
>>> Ignite prints this warning if nodes on the local machine require more
>>> than 80% of physical RAM.
>>>
>>> From the code, you can see that total heap/off-heap memory is summed
>>> over nodes having the same MAC address. This is how the total memory
>>> used by the local machine is calculated.
>>>
>>> --
>>> Best wishes,
>>> Amelchev Nikita
>>>
>>> --
>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
