Ok, thanks. Igor
On Oct 13, 2016 4:37 PM, "Valentin Kulichenko" <valentin.kuliche...@gmail.com> wrote:

> Here is the ticket: https://issues.apache.org/jira/browse/IGNITE-4075
>
> -Val
>
> On Wed, Oct 12, 2016 at 6:45 PM, Igor Rudyak <irud...@gmail.com> wrote:
>
>> Hi Val,
>>
>> I don't have any objections - please create a ticket and link it to the
>> root ticket https://issues.apache.org/jira/browse/IGNITE-1371
>>
>> Igor
>>
>> On Wed, Oct 12, 2016 at 4:10 PM, Valentin Kulichenko <
>> valentin.kuliche...@gmail.com> wrote:
>>
>>> Hi Igor,
>>>
>>> 1. I still think we should do this. Loading nothing is very
>>> counterintuitive and keeps a new user from getting started quickly. For
>>> large tables, when only part of the dataset is needed, the user will
>>> explicitly specify the query, of course. Do you have objections? If not,
>>> I will create a ticket.
>>>
>>> 2. Got it, thanks.
>>>
>>> -Val
>>>
>>> On Mon, Oct 10, 2016 at 12:12 AM, Igor Rudyak <irud...@gmail.com> wrote:
>>>
>>>> Hi Val,
>>>>
>>>> 1) Well, it's not a problem to implement such default behavior, but
>>>> there is one concern. In most cases, when you are using Cassandra as a
>>>> persistent store you are going to store a large amount of data, which is
>>>> significantly bigger than the amount of RAM in your Ignite cluster. In
>>>> such a case it doesn't make sense to launch a CQL query like "select *
>>>> from my_table" because:
>>>> a) You still will not be able to keep all the data from the Cassandra
>>>> table in the Ignite cache
>>>> b) All the data will be pulled from the Cassandra table using only one
>>>> thread - which is very slow
>>>>
>>>> 2) Unfortunately it's not possible in Cassandra. For JDBC you are
>>>> splitting the table into chunks of 512 rows each, using sub-queries and
>>>> ordering by primary keys. Such things are not supported in Cassandra.
>>>> Probably the only way to load data from a Cassandra table in parallel is
>>>> to load it from some specified partitions (in parallel for each
>>>> partition).
>>>>
>>>> Igor Rudyak
>>>>
>>>> On Fri, Oct 7, 2016 at 1:45 PM, Valentin Kulichenko <
>>>> valentin.kuliche...@gmail.com> wrote:
>>>>
>>>>> Hi Igor,
>>>>>
>>>>> Thanks for the response!
>>>>>
>>>>> 1. It's a bit inconsistent with other store implementations we have in
>>>>> the product and actually I find this counterintuitive. Why don't we just
>>>>> load all the data available in the table? An explicit query is useful
>>>>> when you want to customize this and load a subset of the data based on
>>>>> some criteria. If this is not possible for some reason, then I would at
>>>>> least throw an exception in case the query is not specified.
>>>>>
>>>>> 2. Is it possible to automatically split the data into bulks and load
>>>>> them in parallel? We do this in the JDBC store, for example.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Thu, Oct 6, 2016 at 11:00 PM, Igor Rudyak <irud...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Val,
>>>>>>
>>>>>> 1) If you call loadCache(null) it will do nothing. You need to
>>>>>> provide at least one CQL query.
>>>>>>
>>>>>> 2) It depends. If you provide more than one CQL query, it will use a
>>>>>> separate thread for each of the queries (max number of threads is
>>>>>> limited to the number of CPU cores). But for each provided CQL query it
>>>>>> will use only one thread to load all the data returned by the query.
>>>>>> Also it will run the same CQL query from ALL Ignite nodes to load the
>>>>>> same data, which is bad. That's because the loadCache method will be
>>>>>> executed on each Ignite node. As you see, just specifying a CQL query
>>>>>> is not a very efficient way to load data from Cassandra. The ticket I
>>>>>> created is all about how to load data from one table (or from multiple
>>>>>> tables as well) in parallel by partitioning it. That way each Ignite
>>>>>> node will be responsible for loading data from a specific partition
>>>>>> range of the Cassandra table, which is much more efficient.
>>>>>> To support this kind of cache warm-up you should design your
>>>>>> Cassandra table in a specific way - there should be some mapping from
>>>>>> an Ignite partition to the set of Cassandra partitions. Yes, I have
>>>>>> plans to implement this.
>>>>>>
>>>>>> Igor Rudyak
>>>>>>
>>>>>> On Thu, Oct 6, 2016 at 10:19 AM, Valentin Kulichenko <
>>>>>> valentin.kuliche...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Igor,
>>>>>>>
>>>>>>> I've got a couple of quick questions about the Cassandra store.
>>>>>>>
>>>>>>> 1. In [1] you suggested providing an explicit query as a parameter
>>>>>>> for the loadCache() method, because otherwise the user was always
>>>>>>> getting an empty result. Is it a requirement to provide the query?
>>>>>>> What if I just call loadCache(null)?
>>>>>>> 2. There is a ticket [2] about parallel load in the Cassandra store.
>>>>>>> Does it mean that it currently loads only in a single-threaded
>>>>>>> fashion? If so, do you have any plans to implement this improvement?
>>>>>>>
>>>>>>> [1] http://apache-ignite-users.70518.x6.nabble.com/Cannot-query-on-a-cache-using-Cassandra-as-a-persistent-store-td7870.html
>>>>>>> [2] https://gridgain.freshdesk.com/helpdesk/tickets/2180
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Val
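
Igor's point in the thread that each CQL query passed to loadCache gets its own thread can be illustrated with a rough sketch. It builds one query per contiguous token sub-range, so that several queries could be handed to loadCache instead of a single "select * from my_table". This is only an illustration under stated assumptions, not what the Cassandra store does internally: the class name TokenRangeQueries, the table name my_table and the key column id are hypothetical, and the token bounds assume Cassandra's default Murmur3Partitioner. It also does not address the other problem raised in the thread, that every Ignite node runs the same queries.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ignite or the Cassandra driver.
public class TokenRangeQueries {
    // Murmur3Partitioner token bounds (assumption: default partitioner): [-2^63, 2^63 - 1].
    private static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Builds one CQL query per contiguous, non-overlapping token sub-range. */
    public static List<String> build(String table, String partitionKey, int parts) {
        if (parts < 1)
            throw new IllegalArgumentException("parts must be >= 1");

        // Size of the whole token space (2^64), split into equal steps.
        BigInteger total = MAX.subtract(MIN).add(BigInteger.ONE);
        BigInteger step = total.divide(BigInteger.valueOf(parts));

        List<String> queries = new ArrayList<>(parts);
        BigInteger start = MIN;

        for (int i = 0; i < parts; i++) {
            // The last range absorbs any remainder so the full space is covered.
            BigInteger end = (i == parts - 1) ? MAX : start.add(step).subtract(BigInteger.ONE);

            queries.add("select * from " + table
                + " where token(" + partitionKey + ") >= " + start
                + " and token(" + partitionKey + ") <= " + end);

            start = end.add(BigInteger.ONE);
        }

        return queries;
    }

    public static void main(String[] args) {
        // The resulting strings could be passed as the varargs of
        // IgniteCache.loadCache(null, ...), e.g. cache.loadCache(null, queries.toArray()).
        for (String q : build("my_table", "id", 4))
            System.out.println(q);
    }
}
```

A reasonable choice for the number of parts is the number of CPU cores, since the thread says the loader caps its threads at that number.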