Here is the ticket:


On Wed, Oct 12, 2016 at 6:45 PM, Igor Rudyak wrote:

> Hi Val,
> I don't have any objections - please create a ticket and link it to the
> root ticket
> Igor
> On Wed, Oct 12, 2016 at 4:10 PM, Valentin Kulichenko wrote:
>> Hi Igor,
>> 1. I still think we should do this. Loading nothing is very
>> counterintuitive and keeps a new user from getting started quickly. For
>> large tables, when only part of the dataset is needed, the user will
>> explicitly specify the query, of course. Do you have any objections? If
>> not, I will create a ticket.
>> 2. Got it, thanks.
>> -Val
>> On Mon, Oct 10, 2016 at 12:12 AM, Igor Rudyak wrote:
>>> Hi Val,
>>> 1) Well, it's not a problem to implement such default behavior, but
>>> there is one concern. In most cases, when you are using Cassandra as a
>>> persistent store you are going to store a large amount of data, which is
>>> significantly bigger than the amount of RAM in your Ignite cluster. In
>>> such a case it doesn't make sense to launch a CQL query like "select *
>>> from my_table" because:
>>>    a) You still will not be able to keep all the data from the Cassandra
>>> table in the Ignite cache
>>>    b) All the data will be pulled from the Cassandra table using only
>>> one thread, which is very slow
>>> 2) Unfortunately it's not possible in Cassandra. For JDBC you are
>>> splitting the table into chunks of 512 rows each, using sub-queries and
>>> ordering by primary keys. Things like that are not supported in
>>> Cassandra. Probably the only way to load data from a Cassandra table in
>>> parallel is to load it from some specified partitions (in parallel for
>>> each partition).
>>> Igor Rudyak
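
One way to picture "load it from some specified partitions" is to split the partitioner's token ring into contiguous ranges and issue one query per range. The sketch below uses that token-range technique, which is not named in the message above. It assumes the Murmur3Partitioner (tokens in the signed 64-bit range) and a single-column partition key. The class name TokenRangeSketch and the table/column names my_table and id are hypothetical, and nothing here is the Cassandra store's actual code.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split the Murmur3 token ring into contiguous ranges
// and build one CQL query per range. No Cassandra or Ignite call is made.
final class TokenRangeSketch {
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Returns {start, end} pairs that cover the whole ring without gaps. */
    static List<long[]> split(int splits) {
        if (splits < 1) {
            throw new IllegalArgumentException("splits must be >= 1");
        }
        BigInteger width = MAX.subtract(MIN); // BigInteger avoids long overflow
        BigInteger n = BigInteger.valueOf(splits);
        List<long[]> ranges = new ArrayList<>();
        long start = Long.MIN_VALUE;
        for (int i = 1; i <= splits; i++) {
            long end = (i == splits)
                ? Long.MAX_VALUE
                : MIN.add(width.multiply(BigInteger.valueOf(i)).divide(n)).longValueExact();
            ranges.add(new long[] {start, end});
            start = end; // the next range begins where this one ended
        }
        return ranges;
    }

    /** Builds one query per range; the first range includes its lower bound. */
    static List<String> queries(String table, String keyColumn, int splits) {
        List<String> result = new ArrayList<>();
        List<long[]> ranges = split(splits);
        for (int i = 0; i < ranges.size(); i++) {
            long[] r = ranges.get(i);
            String lower = (i == 0) ? ">= " : "> ";
            result.add("select * from " + table
                + " where token(" + keyColumn + ") " + lower + r[0]
                + " and token(" + keyColumn + ") <= " + r[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String q : queries("my_table", "id", 4)) {
            System.out.println(q);
        }
    }
}
```

Each generated query could then be handed to a separate thread or node. How ranges would be assigned to Ignite nodes is left open here, as it is in the thread.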
>>> On Fri, Oct 7, 2016 at 1:45 PM, Valentin Kulichenko wrote:
>>>> Hi Igor,
>>>> Thanks for the response!
>>>> 1. It's a bit inconsistent with other store implementations we have in
>>>> the product and actually I find this counterintuitive. Why don't we just
>>>> load all the data available in the table? An explicit query is useful when
>>>> you want to customize this and load a subset of the data based on some
>>>> criteria. If this is not possible for some reason, then I would at least
>>>> throw an exception if the query is not specified.
>>>> 2. Is it possible to automatically split the data into batches and load
>>>> them in parallel? We do this in the JDBC store, for example.
>>>> -Val
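
The two options in point 1 (load the whole table by default, or fail loudly when no query is given) can be sketched in plain Java. This only illustrates the proposal and is not the store's actual code: the class DefaultLoadQuerySketch, the method resolveQueries, the failWhenMissing flag and the table name my_table are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposed behavior when loadCache gets no query.
final class DefaultLoadQuerySketch {
    /**
     * Returns the CQL queries to run for a load request.
     * With no arguments it either falls back to a full-table select
     * or throws, depending on failWhenMissing.
     */
    static List<String> resolveQueries(String table, boolean failWhenMissing, Object... args) {
        if (args == null || args.length == 0) {
            if (failWhenMissing) {
                throw new IllegalArgumentException(
                    "No CQL query was passed to loadCache for table " + table);
            }
            // Default proposed in the thread: load everything in the table.
            return Collections.singletonList("select * from " + table);
        }
        List<String> queries = new ArrayList<>();
        for (Object arg : args) {
            queries.add(String.valueOf(arg));
        }
        return queries;
    }

    public static void main(String[] args) {
        System.out.println(resolveQueries("my_table", false));
        System.out.println(resolveQueries("my_table", false,
            "select * from my_table where id = 1"));
    }
}
```

Either branch avoids the silent no-op: the caller gets data or gets an error that names the missing query.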
>>>> On Thu, Oct 6, 2016 at 11:00 PM, Igor Rudyak wrote:
>>>>> Hi Val,
>>>>> 1) If you call loadCache(null) it will do nothing. You need to
>>>>> provide at least one CQL query.
>>>>> 2) It depends. If you provide more than one CQL query, it will use a
>>>>> separate thread for each of the queries (the max number of threads is
>>>>> limited to the number of CPU cores). But for each provided CQL query it
>>>>> will use only one thread to load all the data returned by the query. Also,
>>>>> it will run the same CQL query from ALL Ignite nodes to load the same
>>>>> data, which is bad. That's because the loadCache method will be executed
>>>>> on each Ignite node. As you see, it's not a very efficient way to load
>>>>> data from Cassandra just by specifying a CQL query. The ticket I created
>>>>> is all about how to load data from one table (or from multiple tables as
>>>>> well) in parallel by partitioning it. This way each Ignite node will be
>>>>> responsible for loading data from a specific partition range of the
>>>>> Cassandra table, which is much more efficient. To support this kind of
>>>>> cache warm-up you should design your Cassandra table in a specific way:
>>>>> there should be some mapping from an Ignite partition to the set of
>>>>> Cassandra partitions. Yes, I have plans to implement this.
>>>>> Igor Rudyak
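
The threading model described in point 2 (one thread per query, capped at the number of CPU cores, with each query handled by a single thread) can be pictured with a plain java.util.concurrent sketch. It is not the Cassandra store's actual code: the class ParallelQueryLoadSketch, the methods poolSize and loadAll, and the fetchRows callback are hypothetical stand-ins, and no Cassandra or Ignite call is made.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch of "one thread per query, capped at the CPU core count".
final class ParallelQueryLoadSketch {
    /** One thread per query, but never more than the number of cores. */
    static int poolSize(int queryCount, int cpuCores) {
        return Math.max(1, Math.min(queryCount, cpuCores));
    }

    /**
     * Runs each query as its own task. A single task is executed by a single
     * thread, so one large query is never split across threads.
     * fetchRows stands in for executing a CQL query and reading its rows.
     */
    static <R> List<R> loadAll(List<String> queries, Function<String, List<R>> fetchRows) {
        int threads = poolSize(queries.size(), Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (String query : queries) {
                Callable<List<R>> task = () -> fetchRows.apply(query);
                futures.add(pool.submit(task));
            }
            List<R> rows = new ArrayList<>();
            for (Future<List<R>> future : futures) {
                rows.addAll(future.get()); // collected in submission order
            }
            return rows;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A query task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList(
            "select * from my_keyspace.table_a",
            "select * from my_keyspace.table_b");
        // Fake fetch: one placeholder row per query instead of contacting Cassandra.
        System.out.println(loadAll(queries, q -> Collections.singletonList("row from: " + q)));
    }
}
```

In this model a single "select * from my_table" occupies exactly one worker, and every node that runs the same call repeats the same work, which is the inefficiency the message points out.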
>>>>> On Thu, Oct 6, 2016 at 10:19 AM, Valentin Kulichenko wrote:
>>>>>> Hi Igor,
>>>>>> I've got a couple of quick questions about the Cassandra store.
>>>>>>    1. In [1] you suggested providing an explicit query as a
>>>>>>    parameter for the loadCache() method, because otherwise the user was
>>>>>>    always getting an empty result. Is it a requirement to provide the
>>>>>>    query? What if I just call loadCache(null)?
>>>>>>    2. There is a ticket [2] about parallel load in the Cassandra store.
>>>>>>    Does it mean that currently it loads only in a single-threaded
>>>>>>    fashion? If so, do you have any plans to implement this improvement?
>>>>>> [1]
>>>>>> ery-on-a-cache-using-Cassandra-as-a-persistent-store-td7870.html
>>>>>> [2]
>>>>>> Thanks,
>>>>>> Val
