Hello Ray,
Have you tried rewriting the ingestion using the cache API, DML, or better yet
the DataStreamer? It's non-trivial to reason about the problem when the extra
Spark layer is added.
Why do you have such a non-trivial number of fields (and indexes) in the cache
key object? Maybe try synthetic keys?
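For reference, a minimal sketch of ingesting directly with IgniteDataStreamer, bypassing the Spark layer entirely. The cache name, config file, and key/value types are placeholders; tune buffer sizes for your cluster:

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class DirectLoad {
    public static void main(String[] args) {
        // "client-config.xml" and "myCache" are placeholder names.
        try (Ignite ignite = Ignition.start("client-config.xml");
             IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("myCache")) {
            streamer.allowOverwrite(false);        // fastest mode: initial load, no overwrite checks
            streamer.perNodeBufferSize(1024);      // entries buffered per node before a batch is sent
            for (long i = 0; i < 1_000_000; i++)
                streamer.addData(i, "value-" + i); // buffered and flushed in batches
        } // try-with-resources close() flushes any remaining buffered entries
    }
}
```

Loading this way removes the Spark serialization and task-scheduling overhead from the picture, which makes the bottleneck easier to isolate.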
Hi Ray
Regarding your question about eviction:
yes, eviction starts only when the memory region is full.
When eviction starts, a warning message is logged. Since we did not find
this message, we can assume that increasing the region size will not speed up
loading for now.
Was the problem solved? Or is it
Hi Ray,
I checked the dumps, and from them it is clear that the client node cannot
generate more load, since the cluster is already busy. But there is no
activity on the server node.
I suppose the problem could be located on one of the three remaining server
nodes. The logs you
Hi Dmitriy,
Thanks for the reply.
I know the eviction is automatic, but does eviction happen only when the
memory is full?
From the log, I didn't see any "Page evictions started, this will affect
storage performance" message.
So my guess is that memory is not fully used up and no eviction
Hi Ray,
I plan to look at the dumps again.
Note that eviction of records from memory to disk does not need any
additional configuration; it works automatically.
So yes, increasing the amount of RAM for the data region will make records
be evicted from memory less often.
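For completeness, in Ignite 2.1/2.2 the data region size is set via a memory policy in the Spring XML configuration. A sketch of the relevant fragment; the policy name and size are illustrative:

```xml
<property name="memoryConfiguration">
  <bean class="org.apache.ignite.configuration.MemoryConfiguration">
    <property name="memoryPolicies">
      <list>
        <bean class="org.apache.ignite.configuration.MemoryPolicyConfiguration">
          <property name="name" value="default_mem_plc"/>
          <!-- 16 GB max region size; a larger region means records are
               evicted to disk less often. -->
          <property name="maxSize" value="#{16L * 1024 * 1024 * 1024}"/>
        </bean>
      </list>
    </property>
  </bean>
</property>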
Sincerely,
Hi Dmitriy,
Thanks for the answers.
The cluster is stable during the data ingestion, no node joining or leaving
happened.
I've been monitoring the cluster's topology and cache entry counts from
visor the whole time.
I'm also confused about why rebalancing is triggered; from visor I can see that
Hi Ray,
Thank you for the thread dumps. 'Failed to wait for partition map exchange' is
related to rebalancing. What could be the reasons for rebalancing? Is it
possible that some nodes joined or left the topology? Data load itself can't
cause rebalancing; partitions are not moved if the cluster is stable.
If
Hi Dmitriy,
I'll try setting those two parameters you mentioned, but I doubt it will
make a difference.
As I mentioned in my reply to Alexey, I found that when the ingestion
speed slows down, Ignite spends a lot of time on rebalancing.
Here's the thread dump for client and server.
The
After reviewing the log, I don't see any "Page evictions started, this will
affect storage performance" messages.
But I found that when the ingestion speed slows down, Ignite spends a lot of
time on rebalancing.
I posted the log file earlier; you can check the archive file for the logs.
Sorry for the misprint; I meant thread dumps, of course.
Tue, 17 Oct 2017 at 18:16, Dmitry Pavlov:
> Hi Ray,
>
> Thank you for your reply. In addition to checkpoint marker, setting page
> size to 4K (already default in newer versions of Ignite) and WAL history
> size to
Hi Ray,
Thank you for your reply. In addition to the checkpoint marker, setting the
page size to 4K (already the default in newer versions of Ignite) and the WAL
history size to “1” may help reduce overhead and space used, and make
loading a little faster.
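In Ignite 2.1 terms, those two settings would look roughly like this in the Spring XML configuration (a sketch; property names follow the 2.1 API and should be checked against your version):

```xml
<!-- Page size: set on the memory configuration. -->
<property name="memoryConfiguration">
  <bean class="org.apache.ignite.configuration.MemoryConfiguration">
    <property name="pageSize" value="4096"/>
  </bean>
</property>

<!-- WAL history size: set on the persistent store configuration. -->
<property name="persistentStoreConfiguration">
  <bean class="org.apache.ignite.configuration.PersistentStoreConfiguration">
    <property name="walHistorySize" value="1"/>
  </bean>
</property>
```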
I apologize if you already mentioned this, but
Hi Ray,
Do you see the "Page evictions started, this will affect storage performance"
message in the log? If so, the dramatic performance drop you observe might
indicate an issue with the page replacement algorithm that we need to
investigate. Can you please check for the message?
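A quick way to check for that message across all node logs (the log path is a placeholder; adjust it to wherever your nodes write their logs):

```shell
# List any Ignite log files that contain the page-eviction warning.
grep -l "Page evictions started" /var/log/ignite/*.log || echo "message not found"
```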
2017-10-17 17:09
I'm using Ignite 2.1.
The phenomenon I observed is that for the first 130M entries the speed is
OK, but after about 130M entries it slows down tremendously and finally gets
stuck.
The odd part is that when I ingest a small amount of data, like 20M entries,
it works OK and the performance is acceptable.
Hi Ray,
I'm also trying to reproduce this behaviour, but with 20M entries it
works fine on Ignite 2.2.
It is expected that in-memory-only mode works faster, because memory has a
write speed several orders of magnitude higher than disk.
Which type of disk is installed in the servers? Is it
The above log was captured when the data ingestion slowed down, not when it
was stuck completely.
The job has been running for two and a half hours now, and the total number
of records to be ingested is 550 million.
During the last ten minutes, fewer than one million records have been
ingested into Ignite.
The performance for
Hello Ray!
Can you please share the cache configuration as well? There's nothing in your
configuration that stands out, so maybe I'll try to reproduce it on hardware.
Did the checkpointing tuning produce any measurable difference? Do you spot
anything in the Ignite logs when nodes get stuck that you can share?
My Ignite config is as follows
Hello Ray,
Can you please share your Ignite configuration? That includes persistence
settings, cache configuration, and memory policy.
Does it get stuck for good, or does it get un-stuck periodically to do some
loading?
This might be related to checkpointing. Try setting
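The reply is cut off here, so the specific settings are unknown. For reference only, the checkpointing knobs available on Ignite 2.1's PersistentStoreConfiguration included the following (a sketch; the values are illustrative, not recommendations from this thread):

```xml
<property name="persistentStoreConfiguration">
  <bean class="org.apache.ignite.configuration.PersistentStoreConfiguration">
    <!-- How often a checkpoint is taken; longer intervals batch more
         dirty-page writes together. -->
    <property name="checkpointingFrequency" value="180000"/>
    <!-- Number of threads writing dirty pages during a checkpoint. -->
    <property name="checkpointingThreads" value="4"/>
  </bean>
</property>
```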
I had the same problem in this thread
http://apache-ignite-users.70518.x6.nabble.com/Performance-of-persistent-store-too-low-when-bulb-loading-td16247.html
Basically, I'm using ignite-spark's savePairs method (which uses an
IgniteDataStreamer under the hood) to ingest 550 million entries of data into Ignite,