Re: [DISCUSS] PIP-10: Introduce Paimon QueryService

Yong Fang Wed, 11 Oct 2023 18:45:03 -0700

Thanks, Jingsong, for initiating this discussion. A big +1 from me. Building
a query service based on Paimon is a very exciting feature: it not only
simplifies the user's online data service architecture, but can also serve as
a dimension (dim) service for stream processing. In addition, we can continue
to improve Paimon's OLAP capability based on this.


The POC PR [1] looks good to me overall. It builds LSM data storage locally
and also sets up a data service in Flink, running as a streaming job.
QueryService is a big piece of work, so I think we can start with this PIP
and the basic abilities in the PR [1]. After that, we can continue to improve
data partitioning, service discovery, the client and SDK, and further
performance- and stability-related features.

Looking forward to QueryService based on Paimon, thanks!

[1] https://github.com/apache/incubator-paimon/pull/2110

On Wed, Oct 11, 2023 at 4:07 PM Ming Li <[email protected]> wrote:

> >
> > Yes, I think there must be a primary key, we can compute the bucket
> > from the primary key, and find which executor to visit.
> > This is the primary key Query Service.
> >
>
> Hi, Jingsong, thank you for providing this explanation. It looks good to
> me for the first version to support only lookups based on primary keys;
> most of our scenarios also use lookups based on primary keys.
>
> > But there are other problems, when the number of executors changes, the
> > service needs to be restarted and the data needs to be loaded.
> >
>
> Hi, jufang, I think this is not a big problem. For the Query Service, the
> first version may be embedded in a Flink job, and high availability depends
> on the implementation of Flink.
>
> Best,
> Ming Li
>
>
> On Tue, Oct 10, 2023 at 17:20, jufang he <[email protected]> wrote:
>
> > Hi, Ming.
> >
> > As Xiangyu mentioned, we encountered the same problem when implementing a
> > similar solution in ByteDance, maybe I can share some experience.
> >
> > The small table has more than 100g of data, so it needs to be placed on
> > separate nodes. To solve the problem of getting Executor addresses during
> > RPC queries, the same hash rules are used in data generation, loading, and
> > querying. When the data is generated, it is written to different
> > directories according to the hash algorithm. When data is loaded into
> > executors, the same hash algorithm is used with a number of executors set
> > in advance, so the data is loaded into different executors. Since a
> > single Executor can still exceed the memory limit, we put the data into a
> > local KV store. When dealing with large tables, the Executor number for the
> > current key to be queried can be calculated according to the same hash
> > algorithm and the number of executors set in advance. Based on the Executor
> > number, we can get the Executor RPC address from ZK.
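
The routing described above can be sketched roughly as follows. This is only a minimal illustration, not the ByteDance implementation or Paimon's bucket assignment: the function names, the use of CRC32 as the shared hash, and the in-memory `registry` dict standing in for ZK are all assumptions made for the example.

```python
import zlib
from typing import Iterable, Mapping


def executor_index(key: bytes, num_executors: int) -> int:
    """Map a key to an executor number with a hash that is stable across processes."""
    if num_executors <= 0:
        raise ValueError("num_executors must be positive")
    # CRC32 is an assumed stand-in for the hash shared by generation, loading and querying.
    return zlib.crc32(key) % num_executors


def executor_address(key: bytes, registry: Mapping[int, str]) -> str:
    """Resolve the RPC address for a key; `registry` is a hypothetical stand-in for ZK."""
    return registry[executor_index(key, len(registry))]


def count_moved(keys: Iterable[bytes], old_n: int, new_n: int) -> int:
    """Count keys whose executor changes when the executor count changes."""
    return sum(1 for k in keys if executor_index(k, old_n) != executor_index(k, new_n))


# Hypothetical addresses, as they might be registered by three executors.
registry = {0: "10.0.0.1:9000", 1: "10.0.0.2:9000", 2: "10.0.0.3:9000"}
address = executor_address(b"user-42", registry)

# With plain modulo routing, changing the executor count remaps many keys,
# which is why the data has to be loaded again after such a change.
keys = [f"user-{i}".encode() for i in range(1000)]
moved = count_moved(keys, 3, 4)
```

Because writer, loader and client all call the same function with the same executor count, no per-key directory lookup is needed; the cost is that the count cannot change without redistributing data.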
> >
> >
> > But there are other problems, when the number of executors changes, the
> > service needs to be restarted and the data needs to be loaded.
> >
> > Best,
> >
> > Jufang
> >
> > On Tue, Oct 10, 2023 at 16:40, Jingsong Li <[email protected]> wrote:
> >
> > > Hi Ming.
> > >
> > > Yes, I think there must be a primary key, we can compute the bucket
> > > from the primary key, and find which executor to visit.
> > > This is the primary key Query Service.
> > >
> > > After that, maybe we can introduce more Query Service types. For
> > > example, another service could be a secondary-index Query Service: it
> > > can be queried by another field to get the primary key (maybe using
> > > RocksDB to maintain the index), and then the primary key Query Service
> > > is queried to get the whole value.
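
The two-step lookup described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the class and function names are hypothetical, plain dicts stand in for both the RocksDB-backed index and the primary key Query Service, and no snapshot alignment is attempted.

```python
from typing import Any, Dict, Hashable, List, Optional

Row = Dict[str, Any]


class PrimaryKeyService:
    """Hypothetical stand-in for the primary key Query Service."""

    def __init__(self, rows: Dict[Hashable, Row]) -> None:
        self._rows = dict(rows)

    def get(self, pk: Hashable) -> Optional[Row]:
        return self._rows.get(pk)


class SecondaryIndexService:
    """Hypothetical index from one non-key field to primary keys (a dict instead of RocksDB)."""

    def __init__(self, rows: Dict[Hashable, Row], field: str) -> None:
        self._index: Dict[Any, List[Hashable]] = {}
        for pk, row in rows.items():
            self._index.setdefault(row[field], []).append(pk)

    def primary_keys(self, value: Any) -> List[Hashable]:
        return list(self._index.get(value, []))


def lookup_by_field(index: SecondaryIndexService,
                    pk_service: PrimaryKeyService,
                    value: Any) -> List[Row]:
    """Step 1: field value -> primary keys. Step 2: primary keys -> whole rows."""
    rows = []
    for pk in index.primary_keys(value):
        row = pk_service.get(pk)
        # The two services are independent, so a key from the index may be
        # missing from the primary key service if they are not aligned.
        if row is not None:
            rows.append(row)
    return rows


table = {
    1: {"id": 1, "city": "Hangzhou", "name": "a"},
    2: {"id": 2, "city": "Beijing", "name": "b"},
    3: {"id": 3, "city": "Hangzhou", "name": "c"},
}
result = lookup_by_field(SecondaryIndexService(table, "city"),
                         PrimaryKeyService(table), "Hangzhou")
```

The `None` check marks the place where, without alignment by snapshot, the index and the primary key service could disagree.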
> > >
> > > The Secondary indexed Query Service and Primary Key Query Service are
> > > independent and unrelated, but then, we can use Snapshot Id to do some
> > > consistent alignment work. But this should be more complicated.
> > >
> > > These things can be imagined, but need lots of work.
> > >
> > > I just created a POC for the first version; it is very rough:
> > > https://github.com/apache/incubator-paimon/pull/2110
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Oct 10, 2023 at 3:36 PM Ming Li <[email protected]> wrote:
> > > >
> > > > Thanks for the proposal!
> > > > It is a common scenario for multiple applications to share the same
> > > > dimension table. As described in the design document, the TableQuery
> > > > client will obtain the addresses of all Executors from the
> > > > AddressServer and then request them through RPC. I have a question
> > > > about this: How does the TableQuery client decide which Executor to
> > > > request? Request all Executors in turn? Or is it restricted that the
> > > > key of lookup must contain bucket-key?
> > > >
> > > > Best,
> > > > Ming Li
> > > >
> > > >
> > > > On Sun, Oct 8, 2023 at 18:35, Jingsong Li <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I want to bring up a discussion about Paimon QueryService [1].
> > > > >
> > > > > The Paimon primary key table already provides an LSM file
> > > > > structure; it is a pity that Paimon cannot provide a queryable
> > > > > service for lookup.
> > > > >
> > > > > A distributed service can download Paimon files locally and provide
> > > > > a Lookup service. It does not affect the write and read processes;
> > > > > it is a separate server. It can be used as:
> > > > >
> > > > > 1. Flink Lookup Join, reused by multiple Flink jobs.
> > > > > 2. Online Service Lookup, which requires high stability (it may not
> > > > > be so stable in the first version).
> > > > >
> > > > > See more in PIP [1].
> > > > >
> > > > > This PIP is a high-level design for Paimon QueryService, not
> > > > > including all details.
> > > > >
> > > > > [1]
> > > > > https://cwiki.apache.org/confluence/display/PAIMON/PIP-10%3A+Introduce+Paimon+QueryService
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > >
> >
>
