Ok, in that spirit let me say I've always found Apache Trafodion to be interesting and credible technology and worthy of anyone's consideration.
On Thu, Aug 8, 2019 at 1:34 PM Rohit Jain <[email protected]> wrote:

> Andrew,
>
> I would never dump on Apache Phoenix. I have worked with James for years
> and have always wanted to see how we could collaborate on various
> aspects, including common data type support and transaction management,
> to name a few. I think the challenge we faced is the Java vs. C++ nature
> of the two projects. I am just pointing out that Apache Trafodion is an
> alternate option that is available. I am also letting people know what is
> NOT in Apache Trafodion, so they understand that before making the time
> investment.
>
> Yes, I do apologize that it sounds a bit like marketing, even though I
> tried to minimize that. But you will see that we have had no marketing at
> all elsewhere, which is one of the reasons why no one seems to know about
> Apache Trafodion.
>
> Rohit
>
> -----Original Message-----
> From: Andrew Purtell <[email protected]>
> Sent: Thursday, August 8, 2019 1:25 PM
> To: Hbase-User <[email protected]>
> Cc: HBase Dev List <[email protected]>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> This is great, but in the future please refrain from borderline marketing
> of a commercial product on these lists. This is not the appropriate venue
> for that.
>
> It is especially poor form to dump on a fellow open source project, as
> you claim to be. This, I think, is the tell behind the commercial
> motivation.
>
> Also, I should point out, being pretty familiar with Phoenix in operation
> where I work, and from my interactions with various Phoenix committers
> and PMC members, that the particular group of HBasers in that meeting
> appeared to share a negative view - which I will not comment on; they are
> entitled to their opinions, and more choice in SQL access to HBase is
> good! - and that view should not be claimed to be universal or even
> representative.
> On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <[email protected]> wrote:
> > Hi folks,
> >
> > This is a nice write-up of the round-table meeting at HBaseConAsia. I
> > would like to address the points I have pulled out from the write-up
> > (at the bottom of this message).
> >
> > Many in the HBase community may not be aware that, besides Apache
> > Phoenix, there is a project called Apache Trafodion, contributed by
> > Hewlett-Packard in 2015, that has now been a top-level project for a
> > while. Apache Trafodion is essentially technology from Tandem-Compaq-HP
> > that started its OLTP / operational journey as NonStop SQL in the early
> > 1990s. Granted, it is a C++ project, but it incorporates 170+ patents
> > that were contributed to Apache. These are capabilities that still
> > don't exist in other databases.
> >
> > It is a full-fledged SQL relational database engine with broad ANSI SQL
> > support, including the OLAP functions mentioned, as well as many de
> > facto standard functions from databases like Oracle. You can go to the
> > Apache Trafodion wiki to see the documentation on everything Trafodion
> > supports.
> >
> > When we introduced Apache Trafodion, we implemented a completely
> > distributed transaction management capability right in the HBase engine
> > using coprocessors; it is completely scalable, with no bottlenecks
> > whatsoever. We have made this infrastructure very efficient over time,
> > e.g. by reducing two-phase commit overhead for single-region
> > transactions. We have presented this at HBaseCon.
> >
> > The engine also supports secondary indexes. However, because of our
> > patented Multi-Dimensional Access Method technology, the need to use a
> > secondary index is substantially reduced. All DDL and index updates are
> > completely protected by ACID transactions.
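The single-region fast path mentioned above can be sketched roughly as follows. All class and method names are hypothetical; this is not Trafodion's actual DTM code, just the shape of the coordinator's decision: when every write in a transaction lands in one region, that region's coprocessor can apply the mutations and the commit marker atomically by itself, so the prepare round of two-phase commit can be skipped.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch of a transaction coordinator's commit-path decision.
 * Real coprocessor plumbing is omitted; only the routing logic is shown.
 */
public class TxnCommitPath {
    public enum Path { ONE_PHASE, TWO_PHASE }

    /** Each write in the transaction is tagged with the region it lands in. */
    public static Path choose(List<String> regionsTouched) {
        Set<String> distinct = new HashSet<>(regionsTouched);
        // A single region can apply the mutations and the commit marker
        // atomically by itself, so no cross-region prepare round is needed.
        return distinct.size() == 1 ? Path.ONE_PHASE : Path.TWO_PHASE;
    }

    public static void main(String[] args) {
        System.out.println(choose(List.of("region-a", "region-a"))); // ONE_PHASE
        System.out.println(choose(List.of("region-a", "region-b"))); // TWO_PHASE
    }
}
```

The optimization matters because typical OLTP transactions touch only a handful of rows, which often map to a single region, so the common case avoids cross-region coordination entirely.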
> > Probably because of our own inability to create excitement about the
> > project, and potentially for other reasons, we could not get the
> > community involvement we were expecting. That is why you may see that,
> > while we are maintaining the code base and introducing enhancements to
> > it, much of our focus has shifted to the commercial product based on
> > Apache Trafodion, namely EsgynDB. But if community involvement
> > increases, we can certainly refresh Trafodion with some of the
> > additional functionality we have added on the HBase side of the
> > product.
> >
> > But let me be clear. We are about 150 employees at Esgyn, with 40 or so
> > in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and
> > Guiyang. We cannot sustain the company on service revenue alone. You
> > have seen that companies that tried to do that have not been
> > successful, unless they have a way to leverage the open source project
> > for a different business model - enhanced capabilities, Cloud services,
> > etc.
> >
> > To that end, we have added to EsgynDB complete Disaster Recovery,
> > Point-in-Time fuzzy Backup and Restore, manageability via a Database
> > Manager, multi-tenancy, and a large number of other capabilities for
> > high-availability scale-out production deployments. EsgynDB also
> > provides full BI and Analytics capabilities, leveraging Apache ORC and
> > Parquet, again because of our heritage products supporting up to 250 TB
> > EDWs for HP and for customers like Walmart, competing with Teradata. So
> > yes, it can integrate with other storage engines as needed.
> >
> > However, in spite of all this, the pricing on EsgynDB is very
> > competitive - in other words, "cheap" compared to anything else with
> > the same caliber of capabilities.
> > We have demonstrated the capability of the product by running the
> > TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high
> > concurrency, which our product is particularly well suited for based on
> > its architecture and patents. (The TPC-DS benchmarks are run on ORC and
> > Parquet, for obvious reasons.)
> >
> > We just closed a couple of very large Core Banking deals in Guiyang,
> > where we are replacing these banks' entire Core Banking systems,
> > currently Oracle implementations with which they were having challenges
> > scaling at a reasonable cost. But we have many customers, both in the
> > US and China, that are using EsgynDB for operational, BI, and Analytics
> > needs. And now, finally ... OLTP.
> >
> > I know that this is sounding more like a commercial for Esgyn, but that
> > is not my intent. I would like to make you aware of Apache Trafodion as
> > a solution to many of the issues the community is facing. We will
> > provide full support for Trafodion with community involvement, and we
> > hope that some of that involvement results in EsgynDB revenue that we
> > can sustain the company on 😊. I would like to encourage the community
> > to look at Trafodion to address many of the concerns cited below.
> >
> > "Allan Yang said that most of their customers want secondary index,
> > even more than SQL. And for global strong consistent secondary index,
> > we agree that the only safe way is to use transaction. Other 'local'
> > solutions will be in trouble when splitting/merging."
> >
> > "We talked about Phoenix, the problem for Phoenix is well known: not
> > stable enough. We even had a user on the mailing-list who said he/she
> > will never use Phoenix again."
> >
> > "Some guys said that the current feature set for 3.0.0 is not good
> > enough to attract more users, especially for small companies. Only
> > internal improvements, no user-visible features.
> > SQL and secondary index are very important."
> >
> > "Then we came back to SQL again. Alibaba said that most of their
> > customers are migrating from old businesses, so they need 'full' SQL
> > support. That's why they need Phoenix. And lots of small companies want
> > to run OLAP queries directly on the database; they do not want to use
> > ETL. So maybe in the SQL proxy (planned above), we should delegate the
> > OLAP queries to Spark SQL or something else, rather than just rejecting
> > them."
> >
> > "And a Phoenix committer said that the Phoenix community is currently
> > re-evaluating its relationship with HBase, because when upgrading to
> > HBase 2.1.x, lots of things broke. They plan to break the tie between
> > Phoenix and HBase, which means Phoenix plans to also run on other
> > storage systems. Note: this was not in the meeting, but personally I
> > think this may be good news, since if Phoenix is not HBase-only, we
> > have more reason to introduce our own SQL layer."
> >
> > Rohit Jain
> > CTO
> > Esgyn
> >
> > -----Original Message-----
> > From: Stack <[email protected]>
> > Sent: Friday, July 26, 2019 12:01 PM
> > To: HBase Dev List <[email protected]>
> > Cc: hbase-user <[email protected]>
> > Subject: Re: The note of the round table meeting after HBaseConAsia
> > 2019
> >
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <[email protected]>
> > wrote:
> >
> > > The conclusion of HBaseConAsia 2019 will be available later. And
> > > here is the note of the round table meeting after the conference. A
> > > bit long...
> > >
> > > First we talked about splittable meta. At Xiaomi we have a cluster
> > > which has nearly 200k regions, and meta is very easy to overload and
> > > can not recover.
> > > Anoop said we can try read replicas, but agreed that read replicas
> > > cannot solve all the problems; in the end we still need to split
> > > meta.
> > >
> > > Then we talked about SQL. Allan Yang said that most of their
> > > customers want secondary indexes, even more than SQL. And for a
> > > globally strongly consistent secondary index, we agree that the only
> > > safe way is to use transactions. Other 'local' solutions will be in
> > > trouble when splitting/merging. Xiaomi has a global secondary index
> > > solution - open source it?
> > >
> > > Then we came back to SQL. We talked about Phoenix; the problem for
> > > Phoenix is well known: not stable enough. We even had a user on the
> > > mailing-list who said he/she will never use Phoenix again. Alibaba
> > > and Huawei both have their own in-house SQL solutions, and Huawei
> > > also talked about theirs at HBaseConAsia 2019; they will try to open
> > > source it. And we could introduce a SQL proxy in the hbase-connector
> > > repo. No push-down support at first; all logic is done on the proxy
> > > side, and we can optimize later.
> > >
> > > Some guys said that the current feature set for 3.0.0 is not good
> > > enough to attract more users, especially for small companies. Only
> > > internal improvements, no user-visible features. SQL and secondary
> > > indexes are very important.
> > >
> > > Yu Li talked about the CCSMap; we still want it to be released in
> > > 3.0.0. One problem is its relationship with in-memory compaction.
> > > Theoretically they should have no conflicts, but actually they do.
> > > And the Xiaomi guys mentioned that in-memory compaction still has
> > > some bugs; even in basic mode, the MVCC writePoint may get stuck and
> > > hang the region server. And Jieshan Bi asked why not just use CCSMap
> > > to replace CSLM.
> > > Yu Li said this is for better memory usage: the index and data could
> > > be placed together.
> > >
> > > Then we started to talk about HBase on cloud. For now, it is a bit
> > > difficult to deploy HBase on cloud, as we need to deploy zookeeper
> > > and HDFS first. Then we talked about HBOSS and the WAL abstraction
> > > (HBASE-20952). Wellington said HBOSS basically works; it uses s3a and
> > > zookeeper to help simulate the operations of HDFS. We could introduce
> > > our own 'FileSystem' interface, not the hadoop one, and we could
> > > remove the 'atomic renaming' dependency so the 'FileSystem'
> > > implementation will be easier. And on the WAL abstraction, Wellington
> > > said there are still some guys working on it, but for now they are
> > > focusing on patching ratis, rather than abstracting the WAL system
> > > first. We agreed that a better way is to abstract the WAL system at a
> > > level higher than FileSystem, so maybe we could even use Kafka to
> > > store the WAL.
> > >
> > > Then we talked about the FPGA usage for compaction at Alibaba.
> > > Jieshan Bi said that at Huawei they offload compaction to the storage
> > > layer. For an open source solution, maybe we could offload compaction
> > > to spark, and then use something like bulkload to let the region
> > > server load the new HFiles. The problem with doing compaction inside
> > > the region server is the CPU cost and GC pressure. We need to scan
> > > every cell, so the CPU cost is high. Yu Li talked about their
> > > page-based compaction in the flink state store; maybe it could also
> > > benefit HBase.
> > >
> > > Then it was time for MOB. Huawei said MOB can not solve their
> > > problem.
> > > We still need to read the data through RPC, and it also introduces
> > > pressure on the memstore, since the memstore is still a bit small
> > > compared to a MOB cell. And we will also flush a lot even though
> > > there are only a small number of MOB cells in the memstore, so we
> > > still need to compact a lot. So maybe the suitable scenario for using
> > > MOB is that most of your data is still small and a small amount of
> > > the data is a bit larger, where MOB could increase performance, and
> > > users do not need to use another system to store the larger data.
> > > Huawei said that they implement the logic on the client side. If the
> > > data is larger than a threshold, the client will go to another
> > > storage system rather than HBase.
> > > Alibaba said that if we want to support large blobs, we need to
> > > introduce a streaming API.
> > > And Kuaishou said that they do not use MOB; they just store the data
> > > on HDFS and the index in HBase, a typical solution.
> > >
> > > Then we talked about which company will host next year's
> > > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
> > > Shenzhen. And since there is no HBaseCon in America any more (it is
> > > called 'NoSQL Day'), maybe next year we could just call the
> > > conference HBaseCon.
> > >
> > > Then we came back to SQL again. Alibaba said that most of their
> > > customers are migrating from old businesses, so they need 'full' SQL
> > > support. That's why they need Phoenix. And lots of small companies
> > > want to run OLAP queries directly on the database; they do not want
> > > to use ETL. So maybe in the SQL proxy (planned above), we should
> > > delegate the OLAP queries to Spark SQL or something else, rather than
> > > just rejecting them.
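The client-side routing Huawei described above can be sketched as follows. This is a hypothetical illustration, not Huawei's actual implementation: the class names, the 1 MiB threshold, the `blob://` pointer convention, and the Map-backed stores are all invented for the example. The idea is simply that the client writes oversized values to a separate store and leaves a small pointer row in HBase, so reads can follow the pointer transparently.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of client-side large-value routing. The two maps
 * stand in for an HBase table and a blob store (e.g. HDFS); a real
 * implementation would use a column qualifier or flag rather than a
 * magic "blob://" value prefix.
 */
public class BlobRouter {
    static final int THRESHOLD_BYTES = 1 << 20; // 1 MiB, illustrative only

    final Map<String, byte[]> hbase = new HashMap<>();     // small values + pointers
    final Map<String, byte[]> blobStore = new HashMap<>(); // large values

    /** Writes the value; returns "hbase" or "blob" to show where the body went. */
    public String put(String key, byte[] value) {
        if (value.length <= THRESHOLD_BYTES) {
            hbase.put(key, value);
            return "hbase";
        }
        blobStore.put(key, value);
        // Keep only a reference in HBase so readers know where to fetch from.
        hbase.put(key, ("blob://" + key).getBytes());
        return "blob";
    }

    /** Reads follow the pointer transparently. */
    public byte[] get(String key) {
        byte[] v = hbase.get(key);
        if (v != null && new String(v).startsWith("blob://")) {
            return blobStore.get(key);
        }
        return v;
    }
}
```

This keeps the memstore and flush/compaction pressure described above off the large values entirely, at the cost of a second round trip for big reads and of the client owning the consistency between the two stores.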
> > > And a Phoenix committer said that the Phoenix community is currently
> > > re-evaluating its relationship with HBase, because when upgrading to
> > > HBase 2.1.x, lots of things broke. They plan to break the tie between
> > > Phoenix and HBase, which means Phoenix plans to also run on other
> > > storage systems.
> > > Note: this was not in the meeting, but personally I think this may be
> > > good news, since if Phoenix is not HBase-only, we have more reason to
> > > introduce our own SQL layer.
> > >
> > > Then we talked about Kudu. It is faster than HBase on scans. If we
> > > want to increase scan performance, we should have a larger block
> > > size, but this will lead to slower random reads, so we need to trade
> > > off. The Kuaishou guys asked whether HBase could support storing
> > > HFiles in a columnar format. The answer is no; as said above, it
> > > would slow random reads. But we could learn from what Google has done
> > > in Bigtable. We could write a copy of the data in parquet format to
> > > another FileSystem, and users could just scan the parquet files for
> > > better analysis performance. And if they want the newest data, they
> > > could ask HBase for it, and it should be small. This is more like a
> > > complete solution; not only HBase is involved. But at least we could
> > > introduce some APIs in HBase so users can build the solution in their
> > > own environment. And if you do not care about the newest data, you
> > > could also use replication to replicate the data to ES or other
> > > systems, and search there.
> > >
> > > And Didi talked about their problems using HBase. They use kylin, so
> > > they also have lots of regions, so meta is also a problem for them.
> > > And the pressure on zookeeper is also a problem, as the replication
> > > queues are stored on zk.
> > > And after 2.1, zookeeper is only used as an external storage in the
> > > replication implementation, so it is possible to switch to other
> > > storages, such as etcd. But it is still a bit difficult to store the
> > > data in a system table: right now we need to start the replication
> > > system before the WAL system, but if we want to store the replication
> > > data in an hbase table, obviously the WAL system must be started
> > > before the replication system, as we need the region of the system
> > > table online first, and opening it writes an open marker to the WAL.
> > > We need to find a way to break the deadlock.
> > > And they also mentioned that the rsgroup feature also creates a big
> > > znode on zookeeper, as they have lots of tables. We have HBASE-22514,
> > > which aims to solve the problem.
> > > And last, they shared their experience upgrading from 0.98 to 1.4.x.
> > > The versions should be compatible, but actually there were problems.
> > > They agreed to post a blog about this.
> > >
> > > And the Flipkart guys said they will open source their test suite,
> > > which focuses on consistency (Jepsen?). This is good news; hopefully
> > > we will have another useful tool besides ITBLL.
> > >
> > > That's all. Thanks for reading.
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
