Re: Moving Phoenix master to Hbase 2.2

2020-01-12 Thread István Tóth
Hi!

(Sorry for not replying earlier, I'll need to revisit my mail filters)

I do have a half-baked patch for the lightweight isolation-in-modules approach.

In its current form it's more ugly than complex, but some ugliness is a
small price to pay for not having to maintain multiple branches. (And it
can be beautified later.)

I will complete it and open a PR for it as time permits.

The Tephra solution handles the version differences at run (startup)
time; my current solution works at compile time.
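
To make the run-time versus compile-time distinction concrete, here is a minimal sketch of start-up-time selection, written in Python only for brevity. The registry, the shim names and the helper functions are all hypothetical; this is not Tephra's or Phoenix's actual code. A compile-time approach would instead fix the same choice in the build, so no lookup happens when the process starts.

```python
# Hypothetical registry: HBase (major, minor) -> name of a version-specific shim.
# None of these names come from Phoenix or Tephra; they are placeholders.
SHIMS = {
    (2, 1): "compat_hbase_2_1",
    (2, 2): "compat_hbase_2_2",
}


def parse_major_minor(version):
    """Turn a version string such as '2.2.2' into the tuple (2, 2)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def select_shim(detected_version):
    """Pick the shim at start-up, based on the version detected at run time."""
    key = parse_major_minor(detected_version)
    try:
        return SHIMS[key]
    except KeyError:
        raise RuntimeError("no compat shim for HBase %s" % detected_version)
```

The trade-off in this sketch is that one artifact can serve several HBase versions, but an unsupported version is only discovered when the process starts.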

Istvan




On Fri, Dec 20, 2019 at 11:05 AM la...@apache.org  wrote:

>  Yep that.
> For the record... I do not think this is simple. But it is possible.
>
> On Thursday, December 19, 2019, 8:37:37 PM GMT+1, Andrew Purtell <
> apurt...@apache.org> wrote:
>
>  I can't answer for Lars but whenever version incompatibilities come up
> usually only a handful of files are impacted. In the last round, the
> Phoenix access controller, a related file in the same directory, and the
> RPC scheduler. If you cloned these into separate version specific maven
> modules case by case as needed at each round the differences are fairly
> small. On the other hand if you take a principled approach and abstract all
> the things, it will be a huge effort and nobody realistically will want to
> take it on.
>
> On Thu, Dec 19, 2019 at 11:34 AM Geoffrey Jacoby 
> wrote:
>
> > Lars,
> >
> > I'm curious why you say the differences are easily isolated -- many of the
> > core classes of Phoenix either directly inherit HBase classes or implement
> > HBase interfaces, and those can vary between minor versions. (See my above
> > example of a new coprocessor hook on BaseRegionObserver.)
> >
> > Geoffrey
> >
> > On Thu, Dec 19, 2019 at 10:54 AM la...@apache.org 
> > wrote:
> >
> > > >  Yep. The differences are pretty minimal - provided they can be isolated
> > > > easily.
> > > > Tephra might be a pretty good model. It supports various versions of HBase
> > > > in a single branch and has similar issues as Phoenix (coprocessors, etc).
> > > > -- Lars
> > > > On Thursday, December 19, 2019, 7:07:51 PM GMT+1, Josh Elser <
> > > els...@apache.org> wrote:
> > >
> > >  To clarify, you think that compat modules are better than that
> > > separate-branches model in 4.x?
> > >
> > > On 12/18/19 11:29 AM, la...@apache.org wrote:
> > > > This is really hard to follow.
> > > >
> > > > > I think we should do the same with HBase dependencies in Phoenix that
> > > > > HBase does with Hadoop dependencies.
> > > > >
> > > > > That is:  We could have a maven module with the specific HBase version
> > > > > dependent code.
> > > > > Btw. Tephra does the same... A module for HBase version specific code.
> > > > -- Lars
> > > >
> > > >  On Tuesday, December 17, 2019, 10:00:31 AM GMT+1, Istvan Toth <
> > > st...@apache.org> wrote:
> > > >
> > > > >  What do you think about tying the minor releases to HBase minor releases
> > > > > (not necessarily one-to-one)
> > > > >
> > > > > for example (provided 5.1 is 2020H1)
> > > > >
> > > > > 5.0.0 -> HB 2.0
> > > > > 5.1.0 -> HB 2.2.2 (and whatever 2.1 is API compatible with it)
> > > > > 5.1.x -> HB 2.2.x (treat as maintenance branch, no major new features)
> > > > > 5.2.0 -> HB 2.3.0 (if released by that time)
> > > > > 5.2.x -> HB 2.3.x (treat as maintenance branch, no major new features)
> > > > > 5.3.0 -> HB 2.3.x (if there is no new major/minor HBase release)
> > > > > master -> latest released HBase version
> > > > >
> > > > > Alternatively, we could stick with the same HBase version for patch
> > > > > releases that we used for the first minor release.
> > > > >
> > > > > This would limit the number of branches that we have to maintain in
> > > > > parallel, while providing maintenance branches for older releases, and
> > > > > timely-ish Phoenix releases.
> > > >
> > > > The drawback is that users of old HBase versions won't get the latest
> > > > features, on the other hand they can expect more polish.
> > > >
> > > > Istvan
> > > >
> > > > On Thu, Dec 12, 2019 at 8:05 PM Geoffrey Jacoby 
> > > wrote:
> > > >
> > > > >> Since HBase 2.0 is EOM'ed, I'm +1 for not worrying about 2.0.x
> > > > >> compatibility with the 5.x branch going forward.
> > > > >>
> > > > >> Given how coupled Phoenix is to the implementation details of HBase though,
> > > > >> I'm not sure trying to abstract those away to keep one Phoenix branch per
> > > > >> HBase major version is practical, however. At the least, it would be really
> > > > >> complex.
> > > > >>
> > > > >> For example, in the new year I plan to return to working on the change data
> > > > >> capture and Phoenix-level replication features, both of which depend on
> > > > >> WALKey interface changes and a new RegionObserver coprocessor hook
> > > > >> introduced in HBASE-22622 and HBASE-22623. This was released in HBase 1.5
> > > > >> and will be in the forthcoming HBase 2.3. While the HBase community is
> > > > >> discussing EOMing 1.3 right now, and maybe 1.4 will go in the medium
> > 

Committers please look at the Phoenix tests and fix your failures

2020-01-12 Thread la...@apache.org
... Not much else to say here...
The tests have been failing again for a while... I will NOT fix them again this 
time! Sorry folks.

-- Lars



Re: Python2 EOL

2020-01-12 Thread la...@apache.org
 Heh.
They used to be shell scripts and then we converted them to Python. Personally I
was not a fan of that back then, but anyway.
In any case there's some work to do.

-- Lars

On Friday, January 10, 2020, 7:55:43 AM PST, Josh Elser  
wrote:  
 
 I think converting them to Bash is the right thing to do. We're not 
doing anything fancy.

On 1/9/20 5:10 PM, Andrew Purtell wrote:
> Some of the python scripts are glorified shell scripts and could be
> rewritten as such, such as the launch scripts for psql and sqlline and the
> pqs. I get that python is and was trendier than bash but sometimes the
> right tool for the job is the right tool for the job. Unlike python, bash
> has a very stable grammar.
> 
> On Thu, Jan 9, 2020 at 12:34 PM la...@apache.org  wrote:
> 
>> Hi all,
>>
>> python2 is officially EOL'd. No more changes, improvements, or fixes will
>> be done by the developers.
>> Some Linux distributions stopped shipping Python2.
>>
>> It turns out our scripts do not work with Python3, see: [PHOENIX-5656]
>> Make Phoenix scripts work with Python 3 - ASF JIRA.
>>
>> So what should we do?
>>
>> As outlined in the jira we have 3 options:
>>
>>      1. Do nothing. Phoenix will only work with EOL'd Python 2.
>>      2. Try to make all the scripts work with Python 2 and 3. That's
>> actually not possible in some cases, but we can get close... And it's a lot of
>> work and experimentation.
>>      3. Convert all scripts to Python 3. There's a tool (2to3) to do that
>> automatically. Phoenix will now _only_ work with Python 3.
>>
>> Option 2 is some work - some of it not trivial - that someone would need
>> to pick up. Perhaps we can maintain two versions of all scripts, figure out
>> the version of Python and then use the right one?
>>
>> Let's discuss on the jira. I can't be the only one interested in this :)
>>
>> Cheers.
>>
>> -- Lars
>>
> 
> 
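
The idea quoted above (keep two versions of each script, detect the interpreter, run the matching one) could be sketched roughly as follows. This is only an illustration under stated assumptions: the directory names `bin/py2` and `bin/py3` and the function name are hypothetical and are not taken from the Phoenix source tree. `sys.version_info` is the standard way to read the running interpreter's version.

```python
import sys


def pick_script_dir(version_info=sys.version_info):
    """Return the (hypothetical) directory holding scripts for this interpreter.

    version_info is any sequence whose first element is the major version,
    which lets callers and tests pass a plain tuple such as (2, 7, 18).
    """
    major = version_info[0]
    if major >= 3:
        return "bin/py3"  # assumed location of the Python 3 variants
    return "bin/py2"      # assumed location of the Python 2 variants


if __name__ == "__main__":
    # A launcher would then run the same-named script from this directory.
    print(pick_script_dir())
```

The dispatcher itself has to stay valid under both Python 2 and Python 3, which is why the sketch avoids any syntax that exists in only one of them.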
  

[jira] [Reopened] (PHOENIX-5644) IndexUpgradeTool should sleep only once if there is at least one immutable table provided

2020-01-12 Thread Lars Hofhansl (Jira)


 [ https://issues.apache.org/jira/browse/PHOENIX-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened PHOENIX-5644:


Re-opening until the tests are fixed.

> IndexUpgradeTool should sleep only once if there is at least one immutable 
> table provided
> -
>
>                 Key: PHOENIX-5644
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5644
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.3
>            Reporter: Swaroopa Kadam
>            Assignee: Swaroopa Kadam
>            Priority: Minor
>             Fix For: 5.1.0, 4.15.1, 4.14.4, 4.16.0
>
>         Attachments: PHOENIX-5644.4.x-HBase-1.3.patch,
> PHOENIX-5644.4.x-HBase-1.3.v1.patch, PHOENIX-5644.4.x-HBase-1.3.v2.patch,
> PHOENIX-5644.4.x-HBase-1.3.v3.patch, PHOENIX-5644.v1.patch
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (PHOENIX-5675) IndexUpgradeTool should allow verify options for IndexTool run

2020-01-12 Thread Priyank Porwal (Jira)


 [ https://issues.apache.org/jira/browse/PHOENIX-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Porwal reassigned PHOENIX-5675:
---

Assignee: Swaroopa Kadam

> IndexUpgradeTool should allow verify options for IndexTool run
> --
>
>                 Key: PHOENIX-5675
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5675
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.15.1, 4.14.3
>            Reporter: Priyank Porwal
>            Assignee: Swaroopa Kadam
>            Priority: Major
>             Fix For: 4.15.1, 4.14.4
>
>
> PHOENIX-5658 & PHOENIX-5674 add IndexTool options for before/after
> verifications.
> IndexUpgradeTool must allow pass-through of these IndexTool options when
> submitting rebuild jobs.





[jira] [Created] (PHOENIX-5675) IndexUpgradeTool should allow verify options for IndexTool run

2020-01-12 Thread Priyank Porwal (Jira)
Priyank Porwal created PHOENIX-5675:
---

              Summary: IndexUpgradeTool should allow verify options for
IndexTool run
                  Key: PHOENIX-5675
                  URL: https://issues.apache.org/jira/browse/PHOENIX-5675
              Project: Phoenix
           Issue Type: Improvement
     Affects Versions: 4.14.3, 4.15.1
             Reporter: Priyank Porwal
              Fix For: 4.15.1, 4.14.4


PHOENIX-5658 & PHOENIX-5674 add IndexTool options for before/after
verifications.

IndexUpgradeTool must allow pass-through of these IndexTool options when
submitting rebuild jobs.
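
As a rough illustration of the pass-through idea, the sketch below shows a wrapper that parses only its own options and forwards everything else untouched to the nested tool. It is written in Python with the standard `argparse` module purely to show the pattern; the option names `--tables` and `--verify` and the value `BEFORE` are placeholders assumed for this example, not the actual IndexUpgradeTool or IndexTool command line.

```python
import argparse


def split_args(argv):
    """Separate the wrapper's own options from those meant for the nested tool.

    parse_known_args() returns the parsed namespace plus a list of every
    argument it did not recognize, in the order given on the command line.
    """
    parser = argparse.ArgumentParser(prog="upgrade-tool-sketch")
    parser.add_argument("--tables", required=True)  # hypothetical wrapper option
    own, passthrough = parser.parse_known_args(argv)
    return own, passthrough


if __name__ == "__main__":
    own, rest = split_args(["--tables", "T1,T2", "--verify", "BEFORE"])
    # 'rest' would be appended to the nested tool's argument list unchanged.
    print(own.tables, rest)
```

Forwarding unrecognized options verbatim means the wrapper does not need to change each time the nested tool gains a new option, at the cost of not validating them up front.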





[jira] [Assigned] (PHOENIX-5674) IndexTool to not write already correct index rows/CFs

2020-01-12 Thread Priyank Porwal (Jira)


 [ https://issues.apache.org/jira/browse/PHOENIX-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Porwal reassigned PHOENIX-5674:
---

Assignee: Kadir OZDEMIR

> IndexTool to not write already correct index rows/CFs
> -
>
>                 Key: PHOENIX-5674
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5674
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.15.1, 4.14.3
>            Reporter: Priyank Porwal
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.15.1, 4.14.4
>
>
> IndexTool can avoid writing index rows if they are already consistent with
> the data table. This will be especially useful when rebuilding an index on a
> DR site where indexes are replicated already, but a rebuild might be needed
> for catch-up.
> Likewise, during upgrades from the old indexing scheme to the new consistent
> indexing scheme, if the index data columns are consistent already, IndexTool
> should only rewrite the EmptyColumn to mark the row as verified instead of
> writing the data columns too.





[jira] [Created] (PHOENIX-5674) IndexTool to not write already correct index rows/CFs

2020-01-12 Thread Priyank Porwal (Jira)
Priyank Porwal created PHOENIX-5674:
---

              Summary: IndexTool to not write already correct index rows/CFs
                  Key: PHOENIX-5674
                  URL: https://issues.apache.org/jira/browse/PHOENIX-5674
              Project: Phoenix
           Issue Type: Improvement
     Affects Versions: 4.14.3, 4.15.1
             Reporter: Priyank Porwal
              Fix For: 4.15.1, 4.14.4


IndexTool can avoid writing index rows if they are already consistent with
the data table. This will be especially useful when rebuilding an index on a
DR site where indexes are replicated already, but a rebuild might be needed
for catch-up.

Likewise, during upgrades from the old indexing scheme to the new consistent
indexing scheme, if the index data columns are consistent already, IndexTool
should only rewrite the EmptyColumn to mark the row as verified instead of
writing the data columns too.
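
The per-row decision described above can be pictured with a small toy model. This is a sketch under stated assumptions, not IndexTool code: rows are modeled as plain dictionaries, the `is_verified` flag stands in for the state of the EmptyColumn, and all names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"                    # row is consistent and already verified
    MARK_VERIFIED = "mark_verified"  # only the verified marker needs writing
    REWRITE = "rewrite"              # data columns must be written as well


def plan_index_write(expected_columns, actual_columns, is_verified):
    """Decide what to write for one index row.

    expected_columns: dict of column -> value derived from the data table row.
    actual_columns:   dict currently stored in the index, or None if missing.
    is_verified:      True if the row is already marked as verified.
    """
    if actual_columns is None or actual_columns != expected_columns:
        return Action.REWRITE
    if not is_verified:
        return Action.MARK_VERIFIED
    return Action.SKIP
```

In this model a rebuild on a site whose index is already replicated would mostly produce `SKIP`, while an upgrade to the new indexing scheme with consistent data would mostly produce `MARK_VERIFIED`.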





[jira] [Updated] (PHOENIX-5645) GlobalIndexChecker should prevent compaction from purging very recently deleted cells

2020-01-12 Thread Geoffrey Jacoby (Jira)


 [ https://issues.apache.org/jira/browse/PHOENIX-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Jacoby updated PHOENIX-5645:
-
Attachment: PHOENIX-5645-4.x-HBase-1.5.v3.patch

> GlobalIndexChecker should prevent compaction from purging very recently 
> deleted cells
> -
>
>                 Key: PHOENIX-5645
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5645
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Geoffrey Jacoby
>            Assignee: Geoffrey Jacoby
>            Priority: Major
>         Attachments: PHOENIX-5645-4.x-HBase-1.5-v2.patch,
> PHOENIX-5645-4.x-HBase-1.5.patch, PHOENIX-5645-4.x-HBase-1.5.v3.patch
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> IndexTool rebuilds and index scrutiny can both give strange, incorrect 
> results if a major compaction occurs in the middle of their run. In the 
> rebuild case, it's because we're rewriting "history" on the index at the same 
> time that compaction is rewriting "history" by purging deleted and expired 
> cells. 
> In the case of scrutiny, it's because it does an SCN-based lookback, and if 
> versions are purged on the index before their equivalent data table rows, you 
> can get false errors. 
> Since in the new indexing path we already have a coprocessor on each index, 
> it should override the compaction hook to shield rows newer than some 
> configurable age from being purged during a major compaction.
> In the future, this should be contributed as a general feature to HBase for 
> arbitrary tables. 
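
The shielding rule in the description above (cells newer than a configurable age survive a major compaction even if they are deleted or expired) can be sketched as a simple filter. This is a toy model, not the coprocessor hook itself: cells are plain dictionaries, and the names `max_lookback_ms` and `purgeable` are hypothetical.

```python
def cells_to_purge(cells, now_ms, max_lookback_ms):
    """Return the cells a major compaction may drop under the shielding rule.

    cells: list of dicts with
      "ts"        - cell timestamp in milliseconds
      "purgeable" - True if compaction would normally drop the cell
                    (for example because it is deleted or expired)
    A purgeable cell is kept anyway while it is newer than the cutoff.
    """
    cutoff = now_ms - max_lookback_ms
    return [c for c in cells if c["purgeable"] and c["ts"] < cutoff]
```

With a cutoff like this, a rebuild or scrutiny run that looks back less than the configured age still finds the versions it expects, even if a major compaction ran in the meantime.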





[jira] [Created] (PHOENIX-5673) The mutation state is silently getting cleared on the execution of any DDL

2020-01-12 Thread Sandeep Guggilam (Jira)
Sandeep Guggilam created PHOENIX-5673:
-

              Summary: The mutation state is silently getting cleared on the
execution of any DDL
                  Key: PHOENIX-5673
                  URL: https://issues.apache.org/jira/browse/PHOENIX-5673
              Project: Phoenix
           Issue Type: Bug
     Affects Versions: 4.15.0
             Reporter: Sandeep Guggilam


When we execute any DDL statement, the mutation state is rolled back silently
without informing the user. It should probably throw an exception saying that
the mutation state is not empty when executing any DDL. See the example below:

Steps to reproduce:

create table t1 (pk varchar not null primary key, mycol varchar)

upsert into t1 (pk, mycol) values ('x','x');

create table t2 (pk varchar not null primary key, mycol varchar)

When we execute the above statements and do a conn.commit() at the end,
the upsert is silently rolled back when the second create statement runs, and
you won't see the ('x', 'x') values in the first table. Instead it should
probably throw an exception saying that the mutation state is not empty.
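
The behaviour proposed in this report can be modeled with a toy connection class. This is only a sketch of the suggested semantics, not Phoenix's JDBC API: the class, its methods and the exception name are all hypothetical.

```python
class PendingMutationsError(Exception):
    """Raised when DDL is attempted while uncommitted mutations exist."""


class ToyConnection:
    def __init__(self):
        self.pending = []    # uncommitted upserts (the "mutation state")
        self.committed = []  # rows made durable by commit()

    def upsert(self, row):
        self.pending.append(row)

    def execute_ddl(self, statement):
        # Proposed behaviour: refuse to run DDL instead of silently
        # discarding the pending mutations.
        if self.pending:
            raise PendingMutationsError(
                "mutation state is not empty; commit or roll back first"
            )
        # The DDL itself would run here; omitted in this toy model.

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []
```

Replaying the reproduction steps against this model, the second create statement fails loudly and the ('x', 'x') row is still pending, so a later commit stores it instead of losing it.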


