Re: Why we change index state to PENDING_DISABLE on RegionMovedException

2019-09-10 Thread Vincent Poon
Normally you're right: this should get retried at the HBase layer and would
be transparent.  However, as part of PHOENIX-4130, we have the HBase client
try the write only once, so there's no chance to retry.  We did that to
avoid tying up RPC handlers on the server.
Instead, we retry the entire Phoenix mutation from the client side.  The
index is put into "PENDING_DISABLE" so that, if the next write succeeds, it
can flip back to "ACTIVE".
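
To see this in action, the current index state is recorded in SYSTEM.CATALOG.
A quick check along these lines should show the PENDING_DISABLE/ACTIVE
transitions (a minimal sketch - I'm going from memory on the exact column
names, so verify against your version):

-- list indexes whose state is tracked in the catalog
SELECT TABLE_NAME, INDEX_STATE, INDEX_DISABLE_TIMESTAMP
FROM SYSTEM.CATALOG
WHERE INDEX_STATE IS NOT NULL;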

On Tue, Sep 10, 2019 at 2:29 PM Alexander Batyrshin <0x62...@gmail.com>
wrote:

> As far as I know, RegionMovedException is not a problem at all; it's just a
> notification that we need to update meta information about the table regions
> and retry.
> Why do we do extra work changing the state of the index?
>
> 2019-09-10 22:35:00,764 WARN  [hconnection-0x4a63b6ea-shared--pool10-t961]
> client.AsyncProcess: #41, table=IDX_TABLE, attempt=1/1 failed=1ops, last
> exception: org.apache.hadoop.hbase.exceptions.RegionMovedException: Region
> moved to: hostname=prod023 port=60020 startCode=1568139705179. As
>  of locationSeqNum=93740117. on prod027,60020,1568142287280, tracking
> started Tue Sep 10 22:35:00 MSK 2019; not retrying 1 - final failure
> 2019-09-10 22:35:00,789 INFO
> [RpcServer.default.FPBQ.Fifo.handler=170,queue=10,port=60020]
> index.PhoenixIndexFailurePolicy: Successfully update
> INDEX_DISABLE_TIMESTAMP for IDX_TABLE due to an exception while writing
> updates. indexState=PENDING_DISABLE
> org.apache.phoenix.hbase.index.exception.MultiIndexWriteFailureException:
> disableIndexOnFailure=true, Failed to write to multiple index tables:
> [IDX_TABLE]
> at
> org.apache.phoenix.hbase.index.write.TrackingParallelWriterIndexCommitter.write(TrackingParallelWriterIndexCommitter.java:236)
> at
> org.apache.phoenix.hbase.index.write.IndexWriter.write(IndexWriter.java:195)
> at
> org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:156)
> at
> org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:145)
> at
> org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:614)
> at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:589)
> at
> org.apache.phoenix.hbase.index.Indexer.postBatchMutateIndispensably(Indexer.java:572)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$37.call(RegionCoprocessorHost.java:1048)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1711)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1789)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1745)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1044)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3677)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3138)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3080)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:916)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:844)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2406)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:36621)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2380)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:297)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:277)


Re: Is there any way to using appropriate index automatically?

2019-08-20 Thread Vincent Poon
Check out PHOENIX-5109; it likely fixes your issue.
Unfortunately it's targeted for 4.15.0, which hasn't been released yet.
Maybe you can backport it and see if it works for your query.

On Tue, Aug 20, 2019 at 11:38 AM Ankit Singhal 
wrote:

> CREATE local INDEX local_c_h_index ON test_phoenix.app (c,h) ASYNC;
> (Has been filled data with bulkload and index is active)
>
> Query:
> select /*+ INDEX(test_phoenix.app local_c_h_index) */ * from
> TEST_PHOENIX.APP where c=2 and h = 1 limit 5;
> select * from TEST_PHOENIX.APP where c=2 and h = 1 limit 5;
>
> The first query will use the index local_c_h_index and return quickly; the
> second query won't, and responds slowly.
>
> Yeah, the local index should be used in both the cases, looks like a bug
> to me, can you please raise a JIRA in Phoenix project for the same.
> QueryOptimizer.java may have a relevant code to fix the issue, so the patch
> would really be appreciated.
>
> And, also can you try running "select a,b,c,d,e,f,g,h,i,j,k,m,n from
> TEST_PHOENIX.APP where c=2 and h = 1 limit 5", and see if index is getting
> used.
>
> Regards,
> Ankit Singhal
>
> On Tue, Aug 20, 2019 at 1:49 AM you Zhuang 
> wrote:
>
>> Er, I also read the sentence “Unlike global indexes, local indexes *will* use
>> an index even when all columns referenced in the query are not contained in
>> the index. This is done by default for local indexes because we know that
>> the table and index data co-reside on the same region server thus ensuring
>> the lookup is local.”
>>
>> I ‘m totally confused.
>>
>>
>> On Aug 20, 2019, at 12:32 AM, Josh Elser  wrote:
>>
>> http://phoenix.apache.org/faq.html#Why_isnt_my_secondary_index_being_used
>>
>> On 8/19/19 6:06 AM, you Zhuang wrote:
>>
>> Phoenix-version: 4.14.3-HBase-1.4-SNAPSHOT
>> hbase-version: 1.4.6
>> Table:
>> CREATE TABLE test_phoenix.app (
>> dt integer not null,
>> a bigint not null ,
>> b bigint not null ,
>> c bigint not null ,
>> d bigint not null ,
>> e bigint not null ,
>> f bigint not null ,
>> g bigint not null ,
>> h bigint not null ,
>> i bigint not null ,
>> j bigint not null ,
>> k bigint not null ,
>> m decimal(30,6) ,
>> n decimal(30,6)
>> CONSTRAINT pk PRIMARY KEY (dt, a,b,c,d,e,f,g,h,i,j,k)
>> ) SALT_BUCKETS = 3,UPDATE_CACHE_FREQUENCY = 30;
>> Index:
>> CREATE local INDEX local_c_h_index ON test_phoenix.app (c,h) ASYNC;
>> (Has been filled data with bulkload and index is active)
>> Query:
>> select /*+ INDEX(test_phoenix.app local_c_h_index) */ * from
>> TEST_PHOENIX.APP where c=2 and h = 1 limit 5;
>> select * from TEST_PHOENIX.APP where c=2 and h = 1 limit 5;
>> The first query will use the index local_c_h_index and return quickly; the
>> second query won't, and responds slowly.
>> The explain plan is weird - all of them show the query without using the index.
>>
>> On Aug 19, 2019, at 5:54 PM, Aleksandr Saraseka wrote:
>>
>> We have no problems with that. I mean indexes are used even without
>> hints, if they're suitable for a query.
>> Maybe you can share your Phoenix version, query, index definition and
>> exec plan ?
>>
>> On Mon, Aug 19, 2019 at 12:46 PM you Zhuang <zhuangzixiao...@gmail.com> wrote:
>>
>>    Yeah, I mean with no hint it should use the appropriate index
>>    automatically. I created a local index and a query with the
>>    corresponding index column filter in the where clause. But the query
>>    doesn't use the index; with an index hint it does.
>>
>>
>>
>> --
>> Aleksandr Saraseka
>> DBA
>> 380997600401
>>
>>
>>


Re: Phoenix Index Scrutiny Tool

2019-08-12 Thread Vincent Poon
@Aleksandr did you delete rows from the data table, or the index table?
The output you're showing says that you have orphaned rows in the index
table - i.e. rows that exist only in the index table and have no
corresponding row in the data table.  If you deleted the original rows in
the data table without deleting the corresponding rows in the index table,
and if major compaction has happened (perhaps what you meant by
"hard-delete" ?), then in general there's no way to rebuild the index
correctly, as there's no source data to work off of.  You might have
special cases where an index covers all the data table columns, such that you
could in theory go backwards, but I don't believe we have any tool to do that
yet.

The IndexTool does have a "partial rebuild" option that works in
conjunction with "ALTER INDEX REBUILD ASYNC" - see PHOENIX-2890.  However
this is not well documented, and I haven't personally tried it myself.
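
If you do want to experiment with it, the SQL side is roughly the following
(a sketch only - I haven't run this myself, and the table/index names are
placeholders):

-- flag the index for an asynchronous rebuild rather than rebuilding inline;
-- the actual rebuild work is then picked up outside the client session
ALTER INDEX my_index ON my_data_table REBUILD ASYNC;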

On Fri, Aug 9, 2019 at 2:18 PM Alexander Batyrshin <0x62...@gmail.com>
wrote:

> I have a similar question - how to partially rebuild indexes over a timestamp
> interval, like many MapReduce tools do with --starttime/--endtime
>
> On 9 Aug 2019, at 17:21, Aleksandr Saraseka 
> wrote:
>
> Hello community!
> I'm testing scrutiny tool to check index consistency.
> I hard-deleted from HBase a couple of rows from global index, then ran
> Scrutiny tool, it showed me some output like:
>
>
> SOURCE_TABLE | TARGET_TABLE | SCRUNITY_EXECUTE_TIME | SOURCE_ROW_PK_HASH               | SOURCE_TS     | TARGET_TS | HAS_TARGET_ROW
> INDEX_TABLE  | DATA_TABLE   | 1565358267566         | 8a74d1f8286a7ec7ce99b22ee0723ab1 | 1565358171998 | -1        | false
> INDEX_TABLE  | DATA_TABLE   | 1565358267566         | a2cfe11952f3701d340069f80e2a82b7 | 1565358135292 | -1        | false
>
> so, let's imagine that I want to repair my index and don't want to run
> full rebuild (huge table).
>
> What's the best option ?
>
> Two things came to my mind:
>
> - Find a row in data table, and upset necessary data to index table.
>
> - Find a row in data table,  export it then drop it, and then insert it
> again.
>
> And the main question - how can I get a value from data or index table by
> Primary Key hash ?
>
>
> --
> Aleksandr Saraseka
> DBA
> 380997600401
>  *•*  asaras...@eztexting.com  *•*  eztexting.com
>
>
>


Re: Query optimization

2019-06-21 Thread Vincent Poon
I wasn't able to repro this from sqlline.  The query seems to set up the
correct scan with two filters: skip-scan and the column value filter.
So I don't know why the join without the filter is fast for you, but with
the filter it's slow.
Anything else special about your tables?  e.g. indexes, stats...
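
In case it helps to compare, these are the kinds of checks I'd run on your
side (a sketch - the literal value stands in for your bind parameter, and
UPDATE STATISTICS just refreshes the guideposts the planner uses):

-- confirm the plan shows the skip-scan plus the column value filter
EXPLAIN SELECT table1.pk
FROM table1
WHERE table1.col = 'some_value'
  AND table1.pk IN ( SELECT table2.pk FROM table2 );

-- refresh stats in case stale guideposts are skewing the plan
UPDATE STATISTICS table1 ALL;
UPDATE STATISTICS table2 ALL;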

On Wed, Jun 19, 2019 at 6:18 PM Alexander Batyrshin <0x62...@gmail.com>
wrote:

> Is it possible not to full scan table1 for ’table1.col = ?’, but do this
> check only on subset table1.pk IN (…)?
>
> On 19 Jun 2019, at 23:31, Vincent Poon  wrote:
>
> 'table1.col = ?' will be a full table scan of table1 unless you have a
> secondary index on table.col
> Check the explain plan to see if it's working as expected
>
> On Wed, Jun 19, 2019 at 7:43 AM Alexander Batyrshin <0x62...@gmail.com>
> wrote:
>
>>  Hello,
>> We have 2 tables:
>>
>> Table1 - big one (2000M+ rows):
>>
>> CREATE TABLE table1 (
>> pk varchar PRIMARY KEY,
>> col varchar
>> );
>>
>> Table2 - small one (300K rows):
>>
>> CREATE TABLE table2 (
>> pk varchar PRIMARY KEY,
>> other varchar
>> );
>>
>> Query like this work fast (~ 30sec):
>> SELECT table1.pk,  table1.col
>> FROM table1
>> WHERE table1.pk IN ( SELECT table2.pk FROM table2 )
>>
>> But query like this work quite slow (>10min):
>> SELECT table1.pk
>> FROM table1
>> WHERE table1.col = ? AND table1.pk IN ( SELECT table2.pk FROM table2 )
>>
>> Also query below work slow:
>> SELECT *
>> FROM (
>> SELECT table1.pk,  table1.col
>> FROM table1
>> WHERE table1.pk IN ( SELECT table2.pk FROM table2 )
>> ) AS s
>> WHERE s.col = ?
>>
>> Is there any HINT that can optimize query?
>>
>
>


Re: Query optimization

2019-06-19 Thread Vincent Poon
'table1.col = ?' will be a full table scan of table1 unless you have a
secondary index on table1.col.
Check the explain plan to see if it's working as expected
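
For example, something along these lines (a sketch - the index name is a
placeholder, and the literal stands in for your bind parameter):

-- index on the filtered column; the data table PK is part of the index row
-- key, so selecting table1.pk is covered by this index
CREATE INDEX table1_col_idx ON table1 (col);

-- verify the optimizer now picks it up instead of a full scan
EXPLAIN SELECT table1.pk FROM table1 WHERE table1.col = 'some_value';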

On Wed, Jun 19, 2019 at 7:43 AM Alexander Batyrshin <0x62...@gmail.com>
wrote:

>  Hello,
> We have 2 tables:
>
> Table1 - big one (2000M+ rows):
>
> CREATE TABLE table1 (
> pk varchar PRIMARY KEY,
> col varchar
> );
>
> Table2 - small one (300K rows):
>
> CREATE TABLE table2 (
> pk varchar PRIMARY KEY,
> other varchar
> );
>
> Query like this work fast (~ 30sec):
> SELECT table1.pk,  table1.col
> FROM table1
> WHERE table1.pk IN ( SELECT table2.pk FROM table2 )
>
> But query like this work quite slow (>10min):
> SELECT table1.pk
> FROM table1
> WHERE table1.col = ? AND table1.pk IN ( SELECT table2.pk FROM table2 )
>
> Also query below work slow:
> SELECT *
> FROM (
> SELECT table1.pk,  table1.col
> FROM table1
> WHERE table1.pk IN ( SELECT table2.pk FROM table2 )
> ) AS s
> WHERE s.col = ?
>
> Is there any HINT that can optimize query?
>


Re: Growth in table size and performance degradation on read-queries

2019-01-29 Thread Vincent Poon
Is your MAX_VERSIONS set to 1?  And what is KEEP_DELETED_CELLS set to?
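
If they aren't, something along these lines should tighten them up (a sketch -
Phoenix passes these HBase column-family properties through ALTER TABLE SET,
but double-check the exact property handling on your version, and note that a
major compaction is still needed before old cells actually go away):

-- keep only the latest version and don't retain deleted cells
ALTER TABLE my_table SET VERSIONS = 1, KEEP_DELETED_CELLS = false;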

On Tue, Jan 29, 2019 at 10:41 AM talluri abhishek 
wrote:

> Hi All,
>
> We are seeing a couple of issues on some of our Phoenix tables where the
> size of the tables keep growing 2-3 times after around 2-3 days of
> ingestion and the read performance takes a big hit after that. Now, if we
> insert overwrite the data in that table to a new copy table, the data size
> comes back to normal size and the queries perform fast on that copy table.
>
> Initial table size after 1st day ~ 5G
> After 2 days of ingestion ~ 15G
> Re-write into a copy table ~ 5-6 G
>
> Query performance becomes proportional to the size of the table, lets say
> the query took 40 secs to run on the original table after first day, it
> takes around 130-160 secs after 2 days of ingestion. The same query when
> run on the copy table finishes in around ~40secs.
>
> Most of the ingested data after the first day are mostly updates happening
> on the existing rows, so we thought major compaction should solve the size
> issue but it does not shrink the size every time (load happens in
> parallel when the compaction is run).
> Write performance is always good and we have used salt buckets to even out
> the writes. The primary key is a 12-bit string which is made by the
> concatenation of some account id and an auto-generated transaction number.
>
> One query that has a toll on its performance as mentioned above is:
> *select (list of 50-70 columns) from original_table where account_id IN
> (list of 100k account ids) *[account_id in this query is the primary key
> on that table]
>
> We are currently increasing the heap space on these region servers to
> provide more memstore size, which could reduce the number of flushes for
> the upserted data.
>
> Could there be any other reason for the increase in the size of the table
> apart from the updated rows? How could we better the performance of those
> read queries?
>
> Thanks,
> Abhishek
>


Re: Query All Dynamic Columns

2018-12-26 Thread Vincent Poon
A lot of work is currently going into handling large numbers of views -
splittable syscat, view management, etc. - but I agree that it's not ideal.

There's currently no built-in way to do what you want AFAIK, but you can
manage the column metadata yourself in a separate table:
- store all the column names in a single column value, and read that value
before doing your query.  Use HBase checkAndMutate for locking.
or
- store each column as a separate row.  Then you can do things like filter
by column name efficiently (see the sketch below).
You could 'soft delete' a column by removing its entries.

Would be a nice improvement to have an option to persist dynamic column
names+types in Phoenix.
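
To make the second option concrete, a rough sketch (the registry table and all
names are made up; the SELECT uses Phoenix's standard dynamic-column syntax):

-- side table tracking which dynamic columns exist per data table
CREATE TABLE IF NOT EXISTS dyn_column_registry (
    table_name  VARCHAR NOT NULL,
    column_name VARCHAR NOT NULL,
    column_type VARCHAR,
    CONSTRAINT pk PRIMARY KEY (table_name, column_name));

-- 1) read the registered columns for the table being queried
SELECT column_name, column_type FROM dyn_column_registry
WHERE table_name = 'MY_DATA_TABLE';

-- 2) build the query, declaring those dynamic columns so they can be read
SELECT pk, kv1, kv2 FROM my_data_table (kv1 VARCHAR, kv2 BIGINT)
WHERE pk = 'some-row';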

On Fri, Dec 21, 2018 at 12:18 PM Clay Baenziger (BLOOMBERG/ 731 LEX) <
cbaenzi...@bloomberg.net> wrote:

> Hello,
>
> A user of mine brought up a question around dynamic columns in Phoenix
> today. The quantity of columns should become asymptotic to a few tens of
> thousands of columns as their data fills in.
>
> The user want to query all columns in a table and they are today thinking
> of using views to do this -- but it is ugly management. They have an
> unbounded number of views -- which will pollute the global catalog and fail
> relatively quickly.
>
> Has anyone thought about the potentially wasteful[1] approach of scanning
> all rows in a query to determine columns and then re-running the query for
> the rows once we know what columns the SQL result will contain. Maybe
> something cleaner like persisting the set of columns in the statistics
> table and a SELECT * may return columns with nothing but nulls. Or, even
> better is there an overall better way to model such a wide schema in
> Phoenix?
>
> -Clay
>
> [1]: Perhaps some heuristics could allow for not needing to do 2n reads in
> all cases?
>


Re: Phoenix perform full scan and ignore covered global index

2018-12-26 Thread Vincent Poon
I'm not able to repro this on latest 4.15.0 :
0: jdbc:phoenix:> explain select * from US_POPULATION where city = 'test';
| PLAN                                                                                            | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
| CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER US_POPULATION_COVERED_INDEX ['test']  | null            | null           | null         |
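
For reference, a schema along these lines reproduces that plan (a sketch - the
column list and index definition are assumptions, not necessarily the exact
DDL behind the output above):

CREATE TABLE IF NOT EXISTS US_POPULATION (
    state      CHAR(2) NOT NULL,
    city       VARCHAR NOT NULL,
    population BIGINT
    CONSTRAINT pk PRIMARY KEY (state, city));

-- covered global index so that SELECT * can be served from the index alone
CREATE INDEX US_POPULATION_COVERED_INDEX ON US_POPULATION (city) INCLUDE (population);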


On Sun, Dec 23, 2018 at 7:56 PM Jaanai Zhang  wrote:

> Could you please show your SQL of the CREATE TABLE/INDEX
>
> 
>Jaanai Zhang
>Best regards!
>
>
>
>
> Batyrshin Alexander <0x62...@gmail.com> wrote on Sun, Dec 23, 2018 at 9:38 PM:
>
>> Examples:
>>
>> 1. Ignoring indexes if "*" used for select even index include all columns
>> from source table
>>
>> 0: jdbc:phoenix:127.0.0.1> explain select * from table where "p" = '123123123';
>>
>> | PLAN                                                                                      | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
>> | CLIENT 1608-CHUNK 237983037 ROWS 160746749821 BYTES PARALLEL 30-WAY FULL SCAN OVER table  | 160746749821    | 237983037      | 1545484493647  |
>> | SERVER FILTER BY d."p" = '123123123'                                                      | 160746749821    | 237983037      | 1545484493647  |
>> | CLIENT MERGE SORT                                                                         | 160746749821    | 237983037      | 1545484493647  |
>> 3 rows selected (0.05 seconds)
>>
>>
>> 2. Indexes used if only 1 column selected
>>
>> 0: jdbc:phoenix:127.0.0.1> explain select "c" from table where "p" = '123123123';
>>
>> | PLAN                                                                                                                          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
>> | CLIENT 30-CHUNK 3569628 ROWS 3145729398 BYTES PARALLEL 30-WAY RANGE SCAN OVER table_idx_p [0,'123123123'] - [29,'123123123']  | 3145729398      | 3569628        | 1545484508039  |
>> | SERVER FILTER BY FIRST KEY ONLY                                                                                               | 3145729398      | 3569628        | 1545484508039  |
>> | CLIENT MERGE SORT                                                                                                             | 3145729398      | 3569628        | 1545484508039  |
>> 3 rows selected (0.038 seconds)
>>
>>
>> 3.
>>
>> 0: jdbc:phoenix:127.0.0.1> explain select /*+ INDEX(table table_idx_p) */ * from table where "p" = '123123123';
>>
>> | PLAN                                                                                                                          | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS    |
>> | CLIENT 1608-CHUNK 237983037 ROWS 160746749821 BYTES PARALLEL 30-WAY FULL SCAN OVER table                                      | 3145729398      | 3569628        | 1545484508039  |
>> | CLIENT MERGE SORT                                                                                                             | 3145729398      | 3569628        | 1545484508039  |
>> | SKIP-SCAN-JOIN TABLE 0                                                                                                        | 3145729398      | 3569628        | 1545484508039  |
>> | CLIENT 30-CHUNK 3569628 ROWS 3145729398 BYTES PARALLEL 30-WAY RANGE SCAN OVER table_idx_p [0,'123123123'] - [29,'123123123']  | 3145729398

Re: "upsert select" with "limit" clause

2018-12-20 Thread Vincent Poon
Shawn, that's correct.  UPSERT SELECT with the limit does it in a single
thread, whereas without the limit it is done in parallel in multiple
threads.

Your CSV upsert will be faster because it doesn't need to SELECT - it
already has the data.
You can try increasing phoenix.mutate.maxSize
and phoenix.mutate.maxSizeBytes to see if that helps.

On Thu, Dec 20, 2018 at 10:41 AM Shawn Li  wrote:

> Hi Vincent,
>
> Thanks for checking the source code. You mentioned with limit UPSERT uses
> serial insert, can I interpret this as single thread process? So are you
> saying UPSERT without limit is faster is because it uses parallel
> insert/multi threads to insert? If so, then it makes sense now for both
> speed and memory usage.
>
> But the speed for UPSERT with limit is still pretty slow for us. The
> inserting rate is about 300 rows/sec vs 3000 rows/sec without limit. Also
> bulk csv data loading via psql is also use single thread (per phoenix site:
> "Single-threaded client loading tool for CSV formatted data via the psql
> <https://phoenix.apache.org/download.html#Loading-Data> command"), but
> the insert rate can be 3000-5000 rows/sec for us. Any other reason upsert
> with limit is so slow?
>
> Thanks,
> Shawn
>
> On Wed, Dec 19, 2018 at 5:04 PM Vincent Poon 
> wrote:
>
>> Shawn,
>>
>> Took a quick look, I think what is happening is the UPSERT is done
>> serially when you have LIMIT.
>> Parallel scans are issued for the SELECT, which is why the explain plan
>> shows PARALLEL, but then the results are concatenated via a single
>> LimitingResultIterator, in order to apply the CLIENT LIMIT.
>> The upsert then reads from that iterator and does the mutations in
>> batches.
>>
>> To insert in parallel, we would need some sort of shared state between
>> the writing threads to ensure we respect the limit, and I don't think we
>> currently have something like that.
>>
>> Vincent
>>
>> On Tue, Dec 18, 2018 at 2:31 PM Vincent Poon 
>> wrote:
>>
>>>
>>> Shawn, that sounds like a bug, I would file a JIRA.
>>>
>>> On Tue, Dec 18, 2018 at 12:33 PM Shawn Li  wrote:
>>>
>>>> Hi Vincent & William,
>>>>
>>>>
>>>>
>>>> Below is the explain plan, both are PARALLEL excuted in plan:
>>>>
>>>>
>>>>
>>>> explain upsert into table1 select * from table2;
>>>>
>>>>
>>>>
>>>> UPSERT
>>>> SELECT
>>>>   |
>>>>
>>>> CLIENT 27-CHUNK 915799 ROWS 2831155510 BYTES PARALLEL 18-WAY ROUND
>>>> ROBIN FULL SCAN OVER table2
>>>>
>>>>
>>>>
>>>> explain upsert into table1 select * from table2 limit 200;
>>>>
>>>>
>>>>
>>>> UPSERT
>>>> SELECT
>>>>   |
>>>>
>>>> | CLIENT 27-CHUNK 3600 ROWS 48114000 BYTES PARALLEL 18-WAY
>>>> ROUND ROBIN FULL SCAN OVER table2 |
>>>>
>>>> | SERVER 200 ROW
>>>> LIMIT
>>>> |
>>>>
>>>> | CLIENT 200 ROW LIMIT
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Shawn
>>>>
>>>> On Tue, Dec 18, 2018, 13:30 Vincent Poon >>>
>>>>> Shawn,
>>>>>
>>>>> Can you do an "explain" to show what your two statements are doing?
>>>>> That might give some clues.  Perhaps one is able to be run on the server
>>>>> for some reason and the other is not.
>>>>> Otherwise, I don't see why one would be substantially slower than the
>>>>> other.
>>>>>
>>>>> Vincent
>>>>>
>>>>> On Mon, Dec 17, 2018 at 9:14 PM Shawn Li  wrote:
>>>>>
>>>>>> Hi Jonathan,
>>>>>>
>>>>>> The single threaded on one side sounds logical to me. Hopefully
>>>>>> Vincent can confirm it.
>>>>>>
>>>>>> Thanks,
>>>>>> Shawn
>>>>>>
>>>>>> On Mon, Dec 17, 2018 at 9:25 PM Jonathan Leech 
>>>>>> wrote:
>>>>>>
>>>>>>> My guess is that in order to enforce the limit that it’s effectively
>>>>>>> single threaded in either the select or the upsert.

Re: "upsert select" with "limit" clause

2018-12-19 Thread Vincent Poon
Shawn,

Took a quick look, I think what is happening is the UPSERT is done serially
when you have LIMIT.
Parallel scans are issued for the SELECT, which is why the explain plan
shows PARALLEL, but then the results are concatenated via a single
LimitingResultIterator, in order to apply the CLIENT LIMIT.
The upsert then reads from that iterator and does the mutations in batches.

To insert in parallel, we would need some sort of shared state between the
writing threads to ensure we respect the limit, and I don't think we
currently have something like that.

Vincent

On Tue, Dec 18, 2018 at 2:31 PM Vincent Poon  wrote:

>
> Shawn, that sounds like a bug, I would file a JIRA.
>
> On Tue, Dec 18, 2018 at 12:33 PM Shawn Li  wrote:
>
>> Hi Vincent & William,
>>
>>
>>
>> Below is the explain plan, both are PARALLEL excuted in plan:
>>
>>
>>
>> explain upsert into table1 select * from table2;
>>
>>
>>
>> UPSERT
>> SELECT
>>   |
>>
>> CLIENT 27-CHUNK 915799 ROWS 2831155510 BYTES PARALLEL 18-WAY ROUND ROBIN
>> FULL SCAN OVER table2
>>
>>
>>
>> explain upsert into table1 select * from table2 limit 200;
>>
>>
>>
>> UPSERT
>> SELECT
>>   |
>>
>> | CLIENT 27-CHUNK 3600 ROWS 48114000 BYTES PARALLEL 18-WAY ROUND
>> ROBIN FULL SCAN OVER table2 |
>>
>> | SERVER 200 ROW
>> LIMIT
>> |
>>
>> | CLIENT 200 ROW LIMIT
>>
>>
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Shawn
>>
>> On Tue, Dec 18, 2018, 13:30 Vincent Poon >
>>> Shawn,
>>>
>>> Can you do an "explain" to show what your two statements are doing?
>>> That might give some clues.  Perhaps one is able to be run on the server
>>> for some reason and the other is not.
>>> Otherwise, I don't see why one would be substantially slower than the
>>> other.
>>>
>>> Vincent
>>>
>>> On Mon, Dec 17, 2018 at 9:14 PM Shawn Li  wrote:
>>>
>>>> Hi Jonathan,
>>>>
>>>> The single threaded on one side sounds logical to me. Hopefully Vincent
>>>> can confirm it.
>>>>
>>>> Thanks,
>>>> Shawn
>>>>
>>>> On Mon, Dec 17, 2018 at 9:25 PM Jonathan Leech 
>>>> wrote:
>>>>
>>>>> My guess is that in order to enforce the limit that it’s effectively
>>>>> single threaded in either the select or the upsert.
>>>>>
>>>>> On Dec 17, 2018, at 6:43 PM, Shawn Li  wrote:
>>>>>
>>>>> Hi Vincent,
>>>>>
>>>>> Thanks for explaining. That makes much more sense now and it explains
>>>>> the high memory usage when without "limit" clause. Because it upserts much
>>>>> quickly when using "upsert select" without "limit", the memory usage in
>>>>> client machine is much higher than "upsert select" with "limit" .
>>>>>
>>>>> So back to the other question. Can you explain what is underlying
>>>>> Phoenix implementation for "upsert select limit"? Why it is slower than
>>>>> without "limit" when insert a huge number (2m rows) like ""upsert into
>>>>> table2 select * from table1 limit 2,000,000;". This is much slower than
>>>>> inserting the whole table (upsert into table2 select * from table1;).
>>>>>
>>>>> Thanks,
>>>>> Xiang
>>>>>
>>>>>
>>>>> On Mon, Dec 17, 2018 at 1:56 PM Vincent Poon 
>>>>> wrote:
>>>>>
>>>>>> Shawn,
>>>>>> Your query *upsert into table2 select * from table1;  *would not be
>>>>>> run on the server - the source and target table are different.  It would
>>>>>> have to be something like:
>>>>>> *upsert into table1 select * from table1;*
>>>>>>
>>>>>> If you want to run server-side upsert select on a target table that
>>>>>> is different from the source table, you need to set
>>>>>> "phoenix.client.enable.server.upsert.select" to true on your client.
>>>>>> The are some other restrictions: the table can't have any global
>>>>>> indexes, and the statement can't have a join or where subquery.  We need 
>>>>>> to

Re: "upsert select" with "limit" clause

2018-12-18 Thread Vincent Poon
Shawn, that sounds like a bug, I would file a JIRA.

On Tue, Dec 18, 2018 at 12:33 PM Shawn Li  wrote:

> Hi Vincent & William,
>
>
>
> Below is the explain plan; both are PARALLEL executed in the plan:
>
> explain upsert into table1 select * from table2;
>
> UPSERT SELECT |
> CLIENT 27-CHUNK 915799 ROWS 2831155510 BYTES PARALLEL 18-WAY ROUND ROBIN FULL SCAN OVER table2
>
> explain upsert into table1 select * from table2 limit 200;
>
> UPSERT SELECT |
> | CLIENT 27-CHUNK 3600 ROWS 48114000 BYTES PARALLEL 18-WAY ROUND ROBIN FULL SCAN OVER table2 |
> | SERVER 200 ROW LIMIT |
> | CLIENT 200 ROW LIMIT
>
> Thanks,
>
> Shawn
>
> On Tue, Dec 18, 2018, 13:30 Vincent Poon 
>> Shawn,
>>
>> Can you do an "explain" to show what your two statements are doing?  That
>> might give some clues.  Perhaps one is able to be run on the server for
>> some reason and the other is not.
>> Otherwise, I don't see why one would be substantially slower than the
>> other.
>>
>> Vincent
>>
>> On Mon, Dec 17, 2018 at 9:14 PM Shawn Li  wrote:
>>
>>> Hi Jonathan,
>>>
>>> The single threaded on one side sounds logical to me. Hopefully Vincent
>>> can confirm it.
>>>
>>> Thanks,
>>> Shawn
>>>
>>> On Mon, Dec 17, 2018 at 9:25 PM Jonathan Leech 
>>> wrote:
>>>
>>>> My guess is that in order to enforce the limit that it’s effectively
>>>> single threaded in either the select or the upsert.
>>>>
>>>> On Dec 17, 2018, at 6:43 PM, Shawn Li  wrote:
>>>>
>>>> Hi Vincent,
>>>>
>>>> Thanks for explaining. That makes much more sense now and it explains
>>>> the high memory usage when without "limit" clause. Because it upserts much
>>>> quickly when using "upsert select" without "limit", the memory usage in
>>>> client machine is much higher than "upsert select" with "limit" .
>>>>
>>>> So back to the other question. Can you explain what is underlying
>>>> Phoenix implementation for "upsert select limit"? Why it is slower than
>>>> without "limit" when insert a huge number (2m rows) like ""upsert into
>>>> table2 select * from table1 limit 2,000,000;". This is much slower than
>>>> inserting the whole table (upsert into table2 select * from table1;).
>>>>
>>>> Thanks,
>>>> Xiang
>>>>
>>>>
>>>> On Mon, Dec 17, 2018 at 1:56 PM Vincent Poon 
>>>> wrote:
>>>>
>>>>> Shawn,
>>>>> Your query *upsert into table2 select * from table1;  *would not be
>>>>> run on the server - the source and target table are different.  It would
>>>>> have to be something like:
>>>>> *upsert into table1 select * from table1;*
>>>>>
>>>>> If you want to run server-side upsert select on a target table that is
>>>>> different from the source table, you need to set
>>>>> "phoenix.client.enable.server.upsert.select" to true on your client.
>>>>> The are some other restrictions: the table can't have any global
>>>>> indexes, and the statement can't have a join or where subquery.  We need 
>>>>> to
>>>>> update the documentation with this information.
>>>>>
>>>>> The reason there are all these hurdles is because it's generally not
>>>>> recommended to do server-side upsert select across different tables,
>>>>> because that means you're doing cross-regionserver RPCs (e.g. read data
>>>>> from a region of sourcetable, and write to a region of targettable on a
>>>>> different regionserver), potentially tying up handlers in an unpredictable
>>>>> way.
>>>>>
>>>>> On Sun, Dec 16, 2018 at 7:12 PM Shawn Li  wrote:
>>>>>
>>>>>> Hi Jaanai,
>>>>>>
>>>>>> According to Phoenix website, " If auto commit is on, and both a)
>>>>>> the target table matches the source table, and b) the select performs no
>>>>>> aggregation, then the population of the target table will be done
>>>>>> completely on the server-

Re: "upsert select" with "limit" clause

2018-12-18 Thread Vincent Poon
Shawn,

Can you do an "explain" to show what your two statements are doing?  That
might give some clues.  Perhaps one is able to be run on the server for
some reason and the other is not.
Otherwise, I don't see why one would be substantially slower than the other.

Vincent

On Mon, Dec 17, 2018 at 9:14 PM Shawn Li  wrote:

> Hi Jonathan,
>
> The single threaded on one side sounds logical to me. Hopefully Vincent
> can confirm it.
>
> Thanks,
> Shawn
>
> On Mon, Dec 17, 2018 at 9:25 PM Jonathan Leech  wrote:
>
>> My guess is that in order to enforce the limit that it’s effectively
>> single threaded in either the select or the upsert.
>>
>> On Dec 17, 2018, at 6:43 PM, Shawn Li  wrote:
>>
>> Hi Vincent,
>>
>> Thanks for explaining. That makes much more sense now and it explains the
>> high memory usage when without "limit" clause. Because it upserts much
>> quickly when using "upsert select" without "limit", the memory usage in
>> client machine is much higher than "upsert select" with "limit" .
>>
>> So back to the other question. Can you explain what is underlying Phoenix
>> implementation for "upsert select limit"? Why it is slower than without
>> "limit" when insert a huge number (2m rows) like ""upsert into table2
>> select * from table1 limit 2,000,000;". This is much slower than inserting
>> the whole table (upsert into table2 select * from table1;).
>>
>> Thanks,
>> Xiang
>>
>>
>> On Mon, Dec 17, 2018 at 1:56 PM Vincent Poon 
>> wrote:
>>
>>> Shawn,
>>> Your query *upsert into table2 select * from table1;  *would not be run
>>> on the server - the source and target table are different.  It would have
>>> to be something like:
>>> *upsert into table1 select * from table1;*
>>>
>>> If you want to run server-side upsert select on a target table that is
>>> different from the source table, you need to set
>>> "phoenix.client.enable.server.upsert.select" to true on your client.
>>> The are some other restrictions: the table can't have any global
>>> indexes, and the statement can't have a join or where subquery.  We need to
>>> update the documentation with this information.
>>>
>>> The reason there are all these hurdles is because it's generally not
>>> recommended to do server-side upsert select across different tables,
>>> because that means you're doing cross-regionserver RPCs (e.g. read data
>>> from a region of sourcetable, and write to a region of targettable on a
>>> different regionserver), potentially tying up handlers in an unpredictable
>>> way.
>>>
>>> On Sun, Dec 16, 2018 at 7:12 PM Shawn Li  wrote:
>>>
>>>> Hi Jaanai,
>>>>
>>>> According to Phoenix website, " If auto commit is on, and both a) the
>>>> target table matches the source table, and b) the select performs no
>>>> aggregation, then the population of the target table will be done
>>>> completely on the server-side (with constraint violations logged, but
>>>> otherwise ignored). Otherwise, data is buffered on the client and, if auto
>>>> commit is on, committed in row batches as specified by the UpsertBatchSize
>>>> connection property (or the phoenix.mutate.upsertBatchSize HBase config
>>>> property which defaults to 1 rows)"
>>>>
>>>> And our sql statement is just: *upsert into table2 select * from
>>>> table1; *which should match the first case, all operations should be
>>>> in server site. But the memory usage on the client machine is higher than
>>>> "upsert select limit" clause. And the memory usage is check by run 'top'
>>>> command under Linux. So we are sure it is caused by "select upsert" in
>>>> Phoenix and not others, and can't explain why there is so high memory usage
>>>> on client/gateway machine when all operations are supposed to happen on the
>>>> serve side.
>>>>
>>>> Thanks,
>>>> Shawn
>>>>
>>>> On Thu, Dec 13, 2018 at 3:15 AM Jaanai Zhang 
>>>> wrote:
>>>>
>>>>> Shawn,
>>>>>
>>>>> The UPSERT SELECT will run in a coprocessor on if it hasn't limit
>>>>> clause, only query 1 table, the query is doing aggregation, no sequences
>>>>> and auto commit is on. Please check your SQL ... and you can also check

Re: "upsert select" with "limit" clause

2018-12-17 Thread Vincent Poon
Shawn,
Your query *upsert into table2 select * from table1;  *would not be run on
the server - the source and target table are different.  It would have to
be something like:
*upsert into table1 select * from table1;*

If you want to run server-side upsert select on a target table that is
different from the source table, you need to set
"phoenix.client.enable.server.upsert.select" to true on your client.
There are some other restrictions: the table can't have any global indexes,
and the statement can't have a join or where subquery.  We need to update
the documentation with this information.

The reason there are all these hurdles is because it's generally not
recommended to do server-side upsert select across different tables,
because that means you're doing cross-regionserver RPCs (e.g. read data
from a region of sourcetable, and write to a region of targettable on a
different regionserver), potentially tying up handlers in an unpredictable
way.

On Sun, Dec 16, 2018 at 7:12 PM Shawn Li  wrote:

> Hi Jaanai,
>
> According to Phoenix website, " If auto commit is on, and both a) the
> target table matches the source table, and b) the select performs no
> aggregation, then the population of the target table will be done
> completely on the server-side (with constraint violations logged, but
> otherwise ignored). Otherwise, data is buffered on the client and, if auto
> commit is on, committed in row batches as specified by the UpsertBatchSize
> connection property (or the phoenix.mutate.upsertBatchSize HBase config
> property which defaults to 1 rows)"
>
> And our sql statement is just: *upsert into table2 select * from table1; 
> *which
> should match the first case, all operations should be in server site. But
> the memory usage on the client machine is higher than "upsert select limit"
> clause. And the memory usage is check by run 'top'  command under Linux. So
> we are sure it is caused by "select upsert" in Phoenix and not others, and
> can't explain why there is so high memory usage on client/gateway machine
> when all operations are supposed to happen on the serve side.
>
> Thanks,
> Shawn
>
> On Thu, Dec 13, 2018 at 3:15 AM Jaanai Zhang 
> wrote:
>
>> Shawn,
>>
>> The UPSERT SELECT will run in a coprocessor on if it hasn't limit clause,
>> only query 1 table, the query is doing aggregation, no sequences and auto
>> commit is on. Please check your SQL ... and you can also check whether some
>> resources have not been released.
>>
>> 
>>Jaanai Zhang
>>Best regards!
>>
>>
>>
>> Shawn Li  wrote on Thu, Dec 13, 2018 at 12:10 PM:
>>
>>> Hi Jaanai,
>>>
>>> Thanks for putting your thought. The behavior you describe is correct on
>>> the Hbase region sever side. The memory usage for blockcache and memstore
>>> will be high under such high throughput. But our phoenix client is on a
>>> gateway machine (no hbase region server sitting on it or any Hbase service
>>> on it), so not sure how to explain such high memory usage for upsert select
>>> without "limit" clause. The high memory usage behavior like all select
>>> results send to client machine, cached in client machine's memory, and then
>>> insert back to target table, which is not like the behavior that should
>>> happen, all of this should be done on the server side as the table schema
>>> is exactly the same. By the way, this happens on both Phoenix 4.7 and
>>> Phoenix 4.14.
>>>
>>>
>>> Thanks,
>>> Shawn
>>>
>>> On Wed, Dec 12, 2018 at 10:26 PM Jaanai Zhang 
>>> wrote:
>>>
>>>> Shawn,
>>>>
>>>>
>>>> For the upsert without limit,  which will read the source table and
>>>> write the target tables on the server side.  I think the higher memory
>>>> usage is caused by using scan cache and memstore under the higher
>>>> throughput.
>>>>
>>>> 
>>>>Jaanai Zhang
>>>>Best regards!
>>>>
>>>>
>>>>
>>>> Shawn Li  wrote on Thu, Dec 13, 2018 at 10:13 AM:
>>>>
>>>>> Hi Vincent,
>>>>>
>>>>> So you describe limit will sent result to client side and then write
>>>>> to server, this might explain why upsert with limit is slower than without
>>>>> limit. But looks like it can't explain the memory usage? The memory usage
>>>>> on client machine is 8gb (without "limit") vs 2gb (with limit), sometime

Re: "upsert select" with "limit" clause

2018-12-12 Thread Vincent Poon
I think it's done client-side if you have LIMIT.  If you have e.g. LIMIT
1000, it would be incorrect for each regionserver to apply that limit on its
own if you have more than one regionserver.  So instead results are sent back
to the client, where the LIMIT is applied and then written back to the server
in the UPSERT.
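
If the goal is just to copy a bounded slice of the table and keep the upsert
on the parallel path, one workaround is to bound the rows by a primary-key
range instead of LIMIT (a sketch with placeholder key values - this only
works if a key range is an acceptable substitute for "first N rows"):

-- a leading-PK range becomes a plain range scan, so the statement avoids the
-- single-threaded LIMIT handling described above
UPSERT INTO table2
SELECT * FROM table1
WHERE pk >= 'row-000000' AND pk < 'row-300000';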

On Wed, Dec 12, 2018 at 1:18 PM Shawn Li  wrote:

> Hi Vincent,
>
>
>
> The table creation statement is similar to below. We have about 200
> fields. Table is mutable and don’t have any index on the table.
>
>
>
> CREATE TABLE IF NOT EXISTS us_population (
>
>   state CHAR(2) NOT NULL,
>
>   city VARCHAR,
>
>   population BIGINT,
>
>   …
>
>   CONSTRAINT my_pk PRIMARY KEY (state));
>
>
>
> Thanks,
>
> Shawn
>
>
>
> On Wed, Dec 12, 2018, 13:42 Vincent Poon 
>> For #2, can you provide the table definition and the statement used?
>> e.g. Is the table immutable, or does it have indexes?
>>
>> On Tue, Dec 11, 2018 at 6:08 PM Shawn/Xiang Li 
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> 1.   Want to check what is underlying running for limit clause used
>>> in the following Upsert statement (is it involving any coprocessor working
>>> behind?):
>>>
>>>
>>>
>>> *  upsert into table2 select * from
>>> table1 limit 300; * (table 1 and table 2 have same schema)
>>>
>>>
>>>
>>>   The above statement is running a lot slower than without
>>> “limit”  clause as shown in following, even the above statement upsert less
>>> data:
>>>
>>>
>>>
>>> *upsert into table2 select * from
>>> table1;*
>>>
>>>
>>>
>>> 2.   We also observe memory usable is pretty high without the limit
>>> clause (8gb vs 2gb), sometimes for big table it can reach 20gb without
>>> using limit clause.  According to phoenix website description for upsert
>>> select “If auto commit is on, and both a) the target table matches the
>>> source table, and b) the select performs no aggregation, then the
>>> population of the target table will be done completely on the server-side
>>> (with constraint violations logged, but otherwise ignored).”
>>>
>>>
>>>
>>>My question is If everything is done on server-side, how
>>> come we have such high memory usage on the client machine?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Shawn
>>>
>>


Re: "upsert select" with "limit" clause

2018-12-12 Thread Vincent Poon
For #2, can you provide the table definition and the statement used?  e.g.
Is the table immutable, or does it have indexes?

On Tue, Dec 11, 2018 at 6:08 PM Shawn/Xiang Li  wrote:

> Hi,
>
>
>
> 1.   Want to check what is underlying running for limit clause used
> in the following Upsert statement (is it involving any coprocessor working
> behind?):
>
>
>
> *  upsert into table2 select * from table1
> limit 300; * (table 1 and table 2 have same schema)
>
>
>
>   The above statement is running a lot slower than without
> “limit”  clause as shown in following, even the above statement upsert less
> data:
>
>
>
> *upsert into table2 select * from table1;*
>
>
>
> 2.   We also observe memory usable is pretty high without the limit
> clause (8gb vs 2gb), sometimes for big table it can reach 20gb without
> using limit clause.  According to phoenix website description for upsert
> select “If auto commit is on, and both a) the target table matches the
> source table, and b) the select performs no aggregation, then the
> population of the target table will be done completely on the server-side
> (with constraint violations logged, but otherwise ignored).”
>
>
>
>My question is If everything is done on server-side, how
> come we have such high memory usage on the client machine?
>
>
>
> Thanks,
>
> Shawn
>


[ANNOUNCE] Apache Phoenix 4.14.1 released

2018-11-14 Thread Vincent Poon
The Apache Phoenix team is pleased to announce the immediate availability
of the 4.14.1 patch release. Apache Phoenix enables SQL-based OLTP and
operational analytics for Apache Hadoop using Apache HBase as its backing
store and providing integration with other projects in the Apache ecosystem
such as Spark, Hive, Pig, Flume, and MapReduce.

This patch release has feature parity with supported HBase versions and
includes critical bug fixes for secondary indexes.

Download source and binaries here [1].

Thanks,
Vincent (on behalf of the Apache Phoenix team)

[1] http://phoenix.apache.org/download.html


Re: ABORTING region server and following HBase cluster "crash"

2018-11-02 Thread Vincent Poon
Indexes in Phoenix should not in theory cause any cluster outage.  An index
write failure should just disable the index, not cause a crash.
In practice, there have been some bugs around race conditions, the most
dangerous of which accidentally trigger a KillServerOnFailurePolicy which
then potentially cascades.
That policy is there for legacy reasons, I believe because at the time that
was the only way to keep indexes consistent - kill the RS and replay from
WAL.
There is now a partial rebuilder which detects when an index has been
disabled due to a write failure, and asynchronously attempts to rebuild the
index.  Killing the RS is supposed to be a last ditch effort only if the
index could not be disabled (because otherwise, your index is out of sync
but still active and your queries will return incorrect results).
PHOENIX-4977 made the policy configurable now.  If you would rather, in the
worst case, have your index potentially get out of sync instead of killing
RSs, you can set that to LeaveIndexActiveFailurePolicy.

On Fri, Nov 2, 2018 at 5:14 PM Neelesh  wrote:

> By no means am I judging Phoenix based on this. This is simply a design
> trade-off (scylladb goes the same route and builds global indexes). I
> appreciate all the effort that has gone in to Phoenix, and it was indeed a
> life saver. But the technical point remains that single node failures have
> potential to cascade to the entire cluster. That's the nature of global
> indexes, not specific to phoenix.
>
> I apologize if my response came off as dismissing phoenix altogether.
> FWIW, I'm a big advocate of phoenix at my org internally, albeit for the
> newer version.
>
>
> On Fri, Nov 2, 2018, 4:09 PM Josh Elser  wrote:
>
>> I would strongly disagree with the assertion that this is some
>> unavoidable problem. Yes, an inverted index is a data structure which,
>> by design, creates a hotspot (phrased another way, this is "data
>> locality").
>>
>> Lots of extremely smart individuals have spent a significant amount of
>> time and effort in stabilizing secondary indexes in the past 1-2 years,
>> not to mention others spending time on a local index implementation.
>> Judging Phoenix in its entirety based off of an arbitrarily old version
>> of Phoenix is disingenuous.
>>
>> On 11/2/18 2:00 PM, Neelesh wrote:
>> > I think this is an unavoidable problem in some sense, if global indexes
>> > are used. Essentially global indexes create a  graph of dependent
>> region
>> > servers due to index rpc calls from one RS to another. Any single
>> > failure is bound to affect the entire graph, which under reasonable
>> load
>> > becomes the entire HBase cluster. We had to drop global indexes just to
>> > keep the cluster running for more than a few days.
>> >
>> > I think Cassandra has local secondary indexes preciesly because of this
>> > issue. Last I checked there were significant pending improvements
>> > required for Phoenix local indexes, especially around read paths ( not
>> > utilizing primary key prefixes in secondary index reads where possible,
>> > for example)
>> >
>> >
>> > On Thu, Sep 13, 2018, 8:12 PM Jonathan Leech > > > wrote:
>> >
>> > This seems similar to a failure scenario I’ve seen a couple times. I
>> > believe after multiple restarts you got lucky and tables were
>> > brought up by Hbase in the correct order.
>> >
>> > What happens is some kind of semi-catastrophic failure where 1 or
>> > more region servers go down with edits that weren’t flushed, and are
>> > only in the WAL. These edits belong to regions whose tables have
>> > secondary indexes. Hbase wants to replay the WAL before bringing up
>> > the region server. Phoenix wants to talk to the index region during
>> > this, but can’t. It fails enough times then stops.
>> >
>> > The more region servers / tables / indexes affected, the more likely
>> > that a full restart will get stuck in a classic deadlock. A good
>> > old-fashioned data center outage is a great way to get started with
>> > this kind of problem. You might make some progress and get stuck
>> > again, or restart number N might get those index regions initialized
>> > before the main table.
>> >
>> > The sure fire way to recover a cluster in this condition is to
>> > strategically disable all the tables that are failing to come up.
>> > You can do this from the Hbase shell as long as the master is
>> > running. If I remember right, it’s a pain since the disable command
>> > will hang. You might need to disable a table, kill the shell,
>> > disable the next table, etc. Then restart. You’ll eventually have a
>> > cluster with all the region servers finally started, and a bunch of
>> > disabled regions. If you disabled index tables, enable one, wait for
>> > it to become available; eg its WAL edits will be replayed, then
>> > enable the associated main table and wait for it to come online. If

Re: ON DUPLICATE KEY with Global Index

2018-10-09 Thread Vincent Poon
We do need to update the docs after PHOENIX-3925, which changed the
behavior from 'recommended' to 'mandatory'.
I'll update the docs now.
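
For reference, on a table without a global index the atomic form is accepted
and looks like this (a sketch with placeholder names):

-- atomically increment on conflict; with a global index on the table this
-- statement is rejected with CANNOT_USE_ON_DUP_KEY_WITH_GLOBAL_IDX instead
UPSERT INTO page_counts (url, hits) VALUES ('/home', 1)
ON DUPLICATE KEY UPDATE hits = hits + 1;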

On Tue, Oct 9, 2018 at 1:08 PM Ankit Singhal 
wrote:

> We do not allow atomic upsert and throw the corresponding exception in the
> cases documented under the limitations section of
> http://phoenix.apache.org/atomic_upsert.html.  Probably a documentation
> needs a little touch to convey this clearly.
>
> On Tue, Oct 9, 2018 at 10:05 AM Josh Elser  wrote:
>
>> Can you elaborate on what is unclear about the documentation? This
>> exception and the related documentation read as being in support of each
>> other to me.
>>
>> On 10/9/18 5:39 AM, Batyrshin Alexander wrote:
>> >   Hello all,
>> > Documentations (http://phoenix.apache.org/atomic_upsert.html) say:
>> >
>> > "Although global indexes on columns being atomically updated are
>> supported, it’s not recommended as a potentially a separate RPC across the
>> wire would be made while the row is under lock to maintain the secondary
>> index."
>> >
>> > But in practice we get:
>> > CANNOT_USE_ON_DUP_KEY_WITH_GLOBAL_IDX(1224, "42Z24", "The ON DUPLICATE
>> KEY clause may not be used when a table has a global index." )
>> >
>> > Is this bug or documentation is outdated?
>> >
>>
>


Re: Table dead lock: ERROR 1120 (XCL20): Writes to table blocked until index can be updated

2018-09-26 Thread Vincent Poon
We are planning a Phoenix 4.14.1 release which will have this fix

On Wed, Sep 26, 2018 at 3:36 PM Batyrshin Alexander <0x62...@gmail.com>
wrote:

> Thank you. We will try somehow...
> Is there any chance that this fix will be included in next release for
> HBASE-1.4 (not 2.0)?
>
> On 27 Sep 2018, at 01:04, Ankit Singhal  wrote:
>
> You might be hitting PHOENIX-4785
> ,  you can apply the
> patch on top of 4.14 and see if it fixes your problem.
>
> Regards,
> Ankit Singhal
>
> On Wed, Sep 26, 2018 at 2:33 PM Batyrshin Alexander <0x62...@gmail.com>
> wrote:
>
>> Any advices? Helps?
>> I can reproduce problem and capture more logs if needed.
>>
>> On 21 Sep 2018, at 02:13, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>
>> Looks like lock goes away 30 minutes after index region split.
>> So i can assume that this issue comes from cache that configured by this
>> option:* phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs*
>>
>>
>>
>> On 21 Sep 2018, at 00:15, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>
>> And how this split looks at Master logs:
>>
>> Sep 20 19:45:04 prod001 hbase[10838]: 2018-09-20 19:45:04,888 INFO
>>  [AM.ZK.Worker-pool5-t282] master.RegionStates: Transition
>> {3e44b85ddf407da831dbb9a871496986 state=OPEN,
>> ts=1537304859509, server=prod013,60020,1537304282885} to
>> {3e44b85ddf407da831dbb9a871496986 state=SPLITTING, ts=1537461904888,
>> server=prod
>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340 INFO
>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>> {3e44b85ddf407da831dbb9a871496986 state=SPLITTING, ts=1537461905340,
>> server=prod013,60020,1537304282885} to {3e44b85ddf407da831dbb9a871496986
>> state=SPLIT, ts=1537461905340, server=pro
>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,340 INFO
>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Offlined
>> 3e44b85ddf407da831dbb9a871496986 from prod013,60020,1537304282885
>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,341 INFO
>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>> {33cba925c7acb347ac3f5e70e839c3cb state=SPLITTING_NEW, ts=1537461905340,
>> server=prod013,60020,1537304282885} to {33cba925c7acb347ac3f5e70e839c3cb
>> state=OPEN, ts=1537461905341, server=
>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,341 INFO
>>  [AM.ZK.Worker-pool5-t284] master.RegionStates: Transition
>> {acb8f16a004a894c8706f6e12cd26144 state=SPLITTING_NEW, ts=1537461905340,
>> server=prod013,60020,1537304282885} to {acb8f16a004a894c8706f6e12cd26144
>> state=OPEN, ts=1537461905341, server=
>> Sep 20 19:45:05 prod001 hbase[10838]: 2018-09-20 19:45:05,343 INFO
>>  [AM.ZK.Worker-pool5-t284] master.AssignmentManager: Handled SPLIT
>> event; 
>> parent=IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1536637905252.3e44b85ddf407da831dbb9a871496986.,
>> daughter a=IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1
>> Sep 20 19:47:41 prod001 hbase[10838]: 2018-09-20 19:47:41,972 INFO
>>  [prod001,6,1537304851459_ChoreService_2]
>> balancer.StochasticLoadBalancer: Skipping load balancing because balanced
>> cluster; total cost is 17.82282205608522, sum multiplier is 1102.0 min cost
>> which need balance is 0.05
>> Sep 20 19:47:42 prod001 hbase[10838]: 2018-09-20 19:47:42,021 INFO
>>  [prod001,6,1537304851459_ChoreService_1] hbase.MetaTableAccessor:
>> Deleted 
>> IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1536637905252.3e44b85ddf407da831dbb9a871496986.
>> Sep 20 19:47:42 prod001 hbase[10838]: 2018-09-20 19:47:42,022 INFO
>>  [prod001,6,1537304851459_ChoreService_1] master.CatalogJanitor:
>> Scanned 779 catalog row(s), gc'd 0 unreferenced merged region(s) and 1
>> unreferenced parent region(s)
>>
>> On 20 Sep 2018, at 21:43, Batyrshin Alexander <0x62...@gmail.com> wrote:
>>
>> Looks like problem was because of index region split
>>
>> Index region split at prod013:
>> Sep 20 19:45:05 prod013 hbase[193055]: 2018-09-20 19:45:05,441 INFO
>>  [regionserver/prod013/10.0.0.13:60020-splits-1537400010677]
>> regionserver.SplitRequest: Region split, hbase:meta updated, and report to
>> master.
>> Parent=IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1536637905252.3e44b85ddf407da831dbb9a871496986.,
>> new
>> regions: 
>> IDX_MARK_O,\x107834005168\x46200020LWfBS4c,1537461904877.33cba925c7acb347ac3f5e70e839c3cb.,
>>  
>> IDX_MARK_O,\x107834005168\x46200068=4YF!YI,1537461904877.acb8f16a004a894c8706f6e12cd26144..
>> Split took 0sec
>> Sep 20 19:45:05 prod013 hbase[193055]: 2018-09-20 19:45:05,441 INFO
>>  [regionserver/prod013/10.0.0.13:60020-splits-1537400010677]
>> regionserver.SplitRequest: Split transaction journal:
>> Sep 20 19:45:05 prod013 hbase[193055]: STARTED at 1537461904853
>> Sep 20 19:45:05 prod013 hbase[193055]: PREPARED at 1537461904877
>> Sep 20 19:45:05 prod013 hbase[193055]: BEFORE_PRE_SPLIT_HOOK at
>> 1537461904877
>> Sep 20 19:45:05 prod013 hbase[193055]: