> I'd rather have immutable state being CASed (compare-and-swap) to make
> the querying cheap and do updates in an optimistic concurrency control
> manner.
Compare-and-swap only works on one memory address. You can't use it to, say, debit one bank account and credit another. The set of valid materializations is just about the only mutable state in Calcite, and I think it will need to be several interconnected data structures, so compare-and-swap (or its high-level equivalent, ConcurrentHashMap) won't cut it.

So we could use locks/monitors (the "synchronized" keyword) or we could use an actor. The key difference between the two is who does the work. With a monitor, each customer grabs the key (there is only one key), walks into the bank vault, and moves the money from one deposit box to another. With an actor, there is a bank employee in the vault who is the only person allowed to move money around. The work done is the same in both models. There are performance advantages to the actor model (the data structures will tend to stay in one core's cache) and there are code-simplicity advantages (the critical code is all in one class or package). The overhead of two puts/gets on an ArrayBlockingQueue per request is negligible. And besides, you can switch to a non-actor implementation of the service if Calcite is single-threaded.

I haven't thought out the details of multi-tenancy. It is not true to say that this is "not a primary requirement for the Calcite project": look at the "data grid (cache)" on the diagram in my "Optiq" talk [1] from 2013. Dynamic materialized views were in from the very start. There can be multiple instances of the actor (each with its own request/response queues), so you could have one per tenant. Also, it is very straightforward to make the actors remote, replacing the queues with RPC over a message broker. Remote actors are called services.
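To make the vault analogy concrete, here is a minimal sketch of the actor pattern described above: a single worker thread owns the mutable state, and callers submit requests through a bounded ArrayBlockingQueue, getting replies back through a future. All names here (VaultActor, Request, transfer) are illustrative, not Calcite API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

/** One worker thread owns the mutable state; clients only touch the queue. */
public class VaultActor {
  /** A request carries its arguments plus a future for the reply. */
  private static final class Request {
    final String from, to;
    final long amount;
    final CompletableFuture<Boolean> reply = new CompletableFuture<>();
    Request(String from, String to, long amount) {
      this.from = from; this.to = to; this.amount = amount;
    }
  }

  private final BlockingQueue<Request> mailbox = new ArrayBlockingQueue<>(1024);
  private final Map<String, Long> accounts = new HashMap<>(); // touched only by the worker

  public VaultActor(Map<String, Long> initial) {
    accounts.putAll(initial);
    Thread worker = new Thread(this::run, "vault-actor");
    worker.setDaemon(true);
    worker.start();
  }

  /** Worker loop: the only code that reads or writes {@code accounts}, so no locks. */
  private void run() {
    try {
      while (true) {
        Request r = mailbox.take();
        Long balance = accounts.get(r.from);
        boolean ok = balance != null && balance >= r.amount;
        if (ok) {
          // A multi-address update, atomic from every observer's point of view.
          accounts.put(r.from, balance - r.amount);
          accounts.merge(r.to, r.amount, Long::sum);
        }
        r.reply.complete(ok);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Client side: one put on the queue, one blocking get on the reply. */
  public boolean transfer(String from, String to, long amount) throws InterruptedException {
    Request r = new Request(from, to, amount);
    mailbox.put(r);
    return r.reply.join();
  }
}
```

The same shape would apply to a materialization service: replace the account map with the interconnected materialization structures, and the worker thread becomes the "bank employee" that is the only code allowed to mutate them.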
Julian

[1] https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework

On Tue, Aug 29, 2017 at 8:25 AM, Jesus Camacho Rodriguez
<[email protected]> wrote:
> LGTM, I think by the time we have support for the outer joins, I might have
> had time to finish the filter tree index implementation too.
>
> -Jesús
>
> On 8/29/17, 3:11 AM, "Christian Beikov" <[email protected]> wrote:
>
>> I'd like to stick to trying to figure out how to support outer joins for
>> now, and when I have an implementation for that, I'd look into the filter
>> tree index if you haven't done it by then.
>>
>> Mit freundlichen Grüßen,
>> ------------------------------------------------------------------------
>> *Christian Beikov*
>>
>> Am 28.08.2017 um 20:01 schrieb Jesus Camacho Rodriguez:
>>> Christian,
>>>
>>> The implementation of the filter tree index is what I was referring to
>>> indeed. In the initial implementation I focused on the rewriting coverage,
>>> but now that the first part is finished, it is at the top of my list, as
>>> I think it is critical to make the whole query rewriting algorithm work
>>> at scale. However, I have not started yet.
>>>
>>> The filter tree index will help to filter not only based on the tables
>>> used by a given query, but also for queries that do not meet the
>>> equivalence class conditions, filter conditions, etc. We could implement
>>> all the preconditions mentioned in the paper, and we could add our own
>>> additional ones. I also think that in a second version, we might need to
>>> add some kind of ranking/limit, as many views might meet the
>>> preconditions for a given query.
>>>
>>> It seems you understood how it should work, so if you could help to
>>> quickstart that work by maybe implementing a first version of the filter
>>> tree index with a couple of basic conditions (table matching and EC
>>> matching?), that would be great. I could review any of the contributions
>>> you make.
>>>
>>> -Jesús
>>>
>>> On 8/28/17, 3:22 AM, "Christian Beikov" <[email protected]> wrote:
>>>
>>>> If the metadata was cached, that would be awesome, especially because
>>>> that would also improve the performance of metadata retrieval for the
>>>> query currently being planned, although I am not sure how the caching
>>>> would work, since the RelNodes are mutable.
>>>>
>>>> Have you considered implementing the filter tree index explained in the
>>>> paper? As far as I understood, the whole thing only works when redundant
>>>> table elimination is implemented. Is that the case? If so, or if it can
>>>> be done easily, I'd propose we initialize all the lookup structures
>>>> during registration and use them during planning. This will improve
>>>> planning time drastically and essentially handle the scalability
>>>> problem you mention.
>>>>
>>>> What other MV-related issues are on your personal todo list, Jesus? I
>>>> read the paper now and think I can help you in one place or another if
>>>> you want.
>>>>
>>>> Mit freundlichen Grüßen,
>>>> ------------------------------------------------------------------------
>>>> *Christian Beikov*
>>>>
>>>> Am 28.08.2017 um 08:13 schrieb Jesus Camacho Rodriguez:
>>>>> Hive does not use the Calcite SQL parser, thus we follow a different
>>>>> path and did not experience the problem on the Calcite end. However,
>>>>> FWIW, we avoided reparsing the SQL every time a query was being planned
>>>>> by creating/managing our own cache too.
>>>>>
>>>>> The metadata providers implement some caching, thus I would expect that
>>>>> once you avoid reparsing every MV, the retrieval time of predicates,
>>>>> lineage, etc. would improve (at least after using the MV for the first
>>>>> time). However, I agree that the information should be inferred when
>>>>> the MV is loaded.
>>>>> In fact, maybe just making some calls to the metadata providers while
>>>>> the MVs are being loaded would do the trick (Julian should confirm
>>>>> this).
>>>>>
>>>>> Btw, you will probably find another scalability issue as the number of
>>>>> MVs grows large with the current implementation of the rewriting, since
>>>>> the pre-filtering implementation in place does not discard many of the
>>>>> views that are not valid to rewrite a given query, and rewriting is
>>>>> attempted with all of them. This last bit is work that I would like to
>>>>> tackle shortly, but I have not created the corresponding JIRA yet.
>>>>>
>>>>> -Jesús
>>>>>
>>>>> On 8/27/17, 10:43 PM, "Rajat Venkatesh" <[email protected]> wrote:
>>>>>
>>>>>> Thread safety and repeated parsing is a problem. We have experience
>>>>>> with managing 10s of materialized views. Repeated parsing takes more
>>>>>> time than execution of the query itself. We also have a similar problem
>>>>>> where concurrent queries (with a potentially different set of
>>>>>> materialized views) may be planned at the same time. We solved it by
>>>>>> maintaining a cache and carefully setting the cache in a thread local.
>>>>>> Relevant code for inspiration:
>>>>>> https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/prepare/Materializer.java
>>>>>> https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/plan/QuarkMaterializeCluster.java
>>>>>>
>>>>>> On Sun, Aug 27, 2017 at 6:50 PM Christian Beikov
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hey, I have been looking a bit into how materialized views perform
>>>>>>> during planning because of a very long test run
>>>>>>> (MaterializationTest#testJoinMaterializationUKFK6), and the current
>>>>>>> state is problematic.
>>>>>>>
>>>>>>> CalcitePrepareImpl#getMaterializations always reparses the SQL, and
>>>>>>> down the line there is a lot of expensive work (e.g. predicate and
>>>>>>> lineage determination) done during planning that could easily be
>>>>>>> pre-calculated and cached during materialization creation.
>>>>>>>
>>>>>>> There is also a bit of a thread safety problem with the current
>>>>>>> implementation. Unless there is a different safety mechanism that I
>>>>>>> don't see, the sharing of the MaterializationService, and thus also
>>>>>>> the maps in MaterializationActor, via a static instance between
>>>>>>> multiple threads is problematic.
>>>>>>>
>>>>>>> Since I mentioned thread safety, how is Calcite supposed to be used
>>>>>>> in a multi-threaded environment? Currently I use a connection pool
>>>>>>> that initializes the schema on new connections, but that is not really
>>>>>>> nice. I suppose caches are also bound to the connection? A thread-safe
>>>>>>> context that can be shared between connections would be nice, to avoid
>>>>>>> all that repetitive work.
>>>>>>>
>>>>>>> Are these known issues which you have thought about how to fix, or
>>>>>>> should I log JIRAs for them and fix them to the best of my knowledge?
>>>>>>> I'd more or less keep the service shared but would implement it using
>>>>>>> a copy-on-write strategy, since I'd expect seldom schema changes after
>>>>>>> startup.
>>>>>>>
>>>>>>> Regarding the repetitive work that partly happens during planning, I'd
>>>>>>> suggest doing it during materialization registration instead, as is
>>>>>>> already mentioned in CalcitePrepareImpl#populateMaterializations.
>>>>>>> Would that be ok?
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Mit freundlichen Grüßen,
>>>>>>> ------------------------------------------------------------------------
>>>>>>> *Christian Beikov*
>>>>>>>
>>
>
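The copy-on-write strategy Christian proposes in the quoted thread can be sketched roughly as follows: readers dereference an immutable snapshot without locking, and the rare writer copies, mutates the copy, and publishes it atomically. If the state is several interconnected structures, they would all hang off the single snapshot object being swapped. Class and method names are hypothetical, not Calcite's actual MaterializationService API, and the view map is simplified to name-to-SQL strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Copy-on-write registry: lock-free reads, rare writers replace the snapshot. */
public class CowMaterializationRegistry {
  /** Current immutable snapshot; never mutated in place. */
  private final AtomicReference<Map<String, String>> views =
      new AtomicReference<>(new HashMap<>());

  /** Read path used by planning threads: a single volatile dereference. */
  public String lookup(String name) {
    return views.get().get(name);
  }

  /** Write path (seldom after startup): copy, mutate, publish; retry on race. */
  public void register(String name, String sql) {
    while (true) {
      Map<String, String> current = views.get();
      Map<String, String> copy = new HashMap<>(current);
      copy.put(name, sql);
      if (views.compareAndSet(current, copy)) {
        return;
      }
    }
  }
}
```

Note this is the one shape in which compare-and-swap does cover multiple structures: the CAS happens on a single root reference, so readers always see a consistent snapshot, at the cost of copying on every write.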
