Re: Next Pig release proposal

Thejas Nair Tue, 25 Oct 2011 10:01:57 -0700

Dmitriy,

I haven't understood how you propose the code in future trunk would get into 1.x releases, once the 1.0 is out. Will it be possible to meet the stability criteria that you are applying to 1.0 for all 1.x releases?

Are you suggesting that all future releases go into a 0.x release before going into a 1.x release?


-Thejas

On 10/25/11 12:30 AM, Dmitriy Ryaboy wrote:

Thanks Santhosh, you understood my meaning precisely.

I believe that unlike other releases, 1.0 is "special" in people's
minds, it's a "we are ready" label. I don't think we should promote
trunk, even in gloriously stable future, to 1.0, for the same reason I
don't think we should promote the current trunk to 1.0 (but would be
ok with, say, promoting 9.2...) -- our 1.0 release should be stable,
and trunk is not by nature, even when we make a release off it. I
would prefer we not tie our hands in search of stability and avoid
adding new features / refactors / etc in trunk because it's got the
1.0 label hanging over it. We already have a history of policing what
goes into dot-dot releases (8.1, 9.1), and I think those have been
working well for creating more stable releases while we can do more
radical stuff on trunk.

On Mon, Oct 24, 2011 at 11:53 PM, Santhosh Srinivasan <[email protected]> wrote:

My understanding of Dmitriy's proposal is as follows:

1. We need to establish (more) stability before we transition to 1.0
2. Assuming that we have satisfied the constraints in point #1 and that we are 
calling the next release 0.10.0, we could have the following situation:
        i. Trunk
        ii. 0.11.0 branch
        iii. 0.10.0 branch
        iv. 0.9.X branch
3. The next release on trunk will be 1.2
4. The next dot release on the 0.11.0 branch will be  1.1, the next dot release 
on 0.10.0 will be 1.0, etc.

I am with Dmitriy on the first two points. For the latter two points, I would 
approach it as follows:

1. The next release on trunk will be 1.0
2. The 0.11.0 dot releases will continue with 0.11.1, 0.11.2, etc., 0.10.0 dot 
releases will continue with 0.10.1, 0.10.2, etc.
3. All subsequent releases based off of trunk and the 1.0 branch will bear the 
1.X.Y signature till we hit the next major release

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[email protected]]
Sent: Monday, October 24, 2011 6:46 PM
To: [email protected]
Subject: Re: Next Pig release proposal

I am just saying based on what's in trunk, 10.0 should not be 1.0. I am not 
sure alphas and betas will work cause people just won't install them...

Olga -- my diagramming skills leave something to be desired :). I was just 
saying we let 0.10 stabilize (via a few dot releases), then move all versions 
up in one fell swoop -- so 0.10 line becomes 1.0, 0.11 becomes 1.1, and if at 
that point 0.12 also exists, it becomes 1.2.
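
A minimal sketch of the "one fell swoop" renumbering described above, assuming the 0.10 line is the one promoted to 1.0. The function name `promote` and the `base_minor` parameter are hypothetical and only illustrate the mapping; they are not part of any Pig tooling.

```python
import re

def promote(version: str, base_minor: int = 10) -> str:
    """Map a 0.<minor>[.<patch>] line to its promoted 1.<n> line.

    Assumption: the 0.<base_minor> line becomes 1.0, and every later
    0.x line shifts up by the same amount (0.11 -> 1.1, 0.12 -> 1.2).
    """
    match = re.fullmatch(r"0\.(\d+)(?:\.\d+)?", version)
    if match is None:
        raise ValueError(f"not a 0.x version: {version!r}")
    minor = int(match.group(1))
    if minor < base_minor:
        raise ValueError(f"{version} predates the 0.{base_minor} line")
    return f"1.{minor - base_minor}"

for line in ("0.10", "0.11", "0.12"):
    print(line, "->", promote(line))
```

Lines older than the promoted one (0.9.x and below) are rejected here rather than mapped, since the proposal only covers 0.10 and later.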

D

On Mon, Oct 24, 2011 at 6:10 PM, Thejas Nair<[email protected]>  wrote:

Dmitriy,
I think what you are saying is something similar to alpha/beta releases.
(maybe beta1, beta2 .. is better).
So the first release could be 1.0.0_beta1. I think this scheme will be easier for
users to understand.
But I am not sure what the criteria for promoting a release from betaX
to general release should be.


Thanks,
Thejas



On 10/24/11 5:38 PM, Dmitriy Ryaboy wrote:

To be a little more concrete about what I am saying here -- I don't
think we should put a "1.0" label on any *.0 release. 0.8.1 is pretty
solid; 0.9.0 has some holes, 0.9.1 is better. If we put 1.0 on what
is currently being thought of as 0.10, it will have some stability /
usability issues (things tend to show up after we make a release and
people in the wild start trying it), and those issues will make a
poor impression on those who expect 1.0 to be shiny and polished
after so much time. I'm in favor of waiting a couple of dot releases,
promoting a stabilized release into 1.0, and going from there. So,
pictorially:

-- trunk --- 0.11-dev ---------- 0.12-dev ------------------| 1.2-dev!
     \               \
      \               \---------------- 0.11.0 -------------| 1.1.0!
       \
        \------- 0.10.0 ------- 0.10.1 ------- 0.10.2 ------| 1.0.0 !!

On Mon, Oct 24, 2011 at 12:43 PM, Dmitriy Ryaboy <[email protected]> wrote:

  I am good with Scheme 2.


We are finding a fair number of issues trying to move from Pig 0.8.1
to 0.9, and I don't think those issues are fixed in 10, either.. not
sure that this "stabilization" process has happened yet.

D


On Mon, Oct 24, 2011 at 11:59 AM, Daniel Dai <[email protected]> wrote:

Yes, we need a versioning scheme. There are two versioning schemes I can
think of:

Scheme 1:
<major>.<patch>
<major> will be the feature-rich release every 3 months
<patch> will be the bug fix release when necessary

Nov release will be 1.0, Feb release will be 2.0. There will be
1.1, 2.1 etc for bug fixes.

Scheme 2:
<major>.<minor>.<patch>
Most of our 3-month releases will be counted as <minor> releases
unless there are major user-facing/disruptive changes.

Nov release will be 1.0.0, Feb release will be 1.1.0. There will be
1.0.1, 1.1.1 etc for bug fixes.
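
To make the difference between the two schemes concrete, here is a rough sketch that numbers the same sequence of releases under each one. The event names ("feature", "bugfix", "disruptive") and the function names are assumptions made for illustration; the starting point of 1.0 / 1.0.0 for the first feature release follows the Nov/Feb examples above.

```python
from typing import List

def scheme1(events: List[str]) -> List[str]:
    """<major>.<patch>: every feature release bumps <major>."""
    major, patch = 0, 0
    versions = []
    for event in events:
        if event in ("feature", "disruptive"):
            major, patch = major + 1, 0
        elif event == "bugfix":
            if major == 0:
                raise ValueError("bug fix release before any feature release")
            patch += 1
        else:
            raise ValueError(f"unknown event: {event!r}")
        versions.append(f"{major}.{patch}")
    return versions

def scheme2(events: List[str]) -> List[str]:
    """<major>.<minor>.<patch>: feature releases bump <minor>,
    only disruptive releases bump <major>."""
    major, minor, patch = 1, -1, 0  # first feature release yields 1.0.0
    versions = []
    for event in events:
        if event == "feature":
            minor, patch = minor + 1, 0
        elif event == "disruptive":
            major, minor, patch = major + 1, 0, 0
        elif event == "bugfix":
            if minor < 0:
                raise ValueError("bug fix release before any feature release")
            patch += 1
        else:
            raise ValueError(f"unknown event: {event!r}")
        versions.append(f"{major}.{minor}.{patch}")
    return versions

timeline = ["feature", "bugfix", "feature", "bugfix"]
print(scheme1(timeline))
print(scheme2(timeline))
```

Under Scheme 1 the second quarterly release already lands on 2.0, while under Scheme 2 it stays within the 1.x series.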

I personally prefer scheme 2; increasing the major version too
frequently might be confusing to users. How do other folks feel?

Daniel


On Sat, Oct 22, 2011 at 2:31 AM, Gianmarco De Francisci Morales <[email protected]> wrote:

Hi,

just my 2 cents.

I think the issue here is not 1.0 vs 0.10, but what's the versioning scheme
we want to use for Pig.
Up to now it has been just an increasing number after a '0.'
prefix, changed when the community felt it was time. I think this
works well for a small project, but it is somewhat fuzzy.

I like the idea of having <major>.<minor>.<patch> versions like many other
projects. It's a very clear and almost standard way of versioning a piece of
software. It has clear rules on when to change each of the
numbers, and lets the user get an idea of backward compatibility
at a glance.

So, to conclude, I am in favor of going 1.0 (or 1.0.0) as long as we decide
a clear versioning policy (whichever it is).
So that the 1.0 milestone would mark the beginning of our new policy.

Cheers,
--
Gianmarco



On Fri, Oct 21, 2011 at 23:10, <[email protected]> wrote:

If one were to rewrite input and output formats to use the webhdfs://
APIs, this would not be an issue, right?

- milind


On 10/21/11 1:50 PM, "Santhosh Srinivasan" <[email protected]> wrote:

If I was not clear in my earlier email, I apologize for the lack of
clarity. I am no longer in favour of waiting for Hadoop API
stability across Hadoop versions. It's a pipe dream.

When we had PigInputFormat and PigOutputFormat, your reasoning would be
spot on. I am concerned about the following. Our tight integration with
Hadoop due to the use of Input and Output format might lead to a break in
backward compatibility. I am not sure if the comparison with that of Java
is valid. Probably a majority of the users don't use JNI. It's very hard
to use Pig without writing custom load and store functions. The default
load and store don't suffice for a majority of use cases that I have
observed.

I am trying to get all factors that might influence this decision. From
the few emails that have been exchanged since yesterday, we have the
following factors:

1. Hadoop 0.20.205 (support for Append)
2. Hadoop 0.22
3. Hadoop 0.23
4. Maturity of the new parser
5. Stability of the new logical plan
6. Other components in the eco system.
       - Avro (1.5.4, 1.4.1, ...)
       - Cassandra (1.0.0, 0.8.7, ...)
       - Chukwa (0.4.0, 0.3.0, ...)
       - Hama (0.3.0, 0.2.0, ...)
       - Hbase (0.90.4, 0.90.3, 0.90.2, 0.90.1, ...)
       - Hive (Releases - 0.7.1, 0.7.0, 0.6.0, ...)
       - Zookeeper (3.3.3, 3.3.2, 3.2.2, 3.1.2, ...)

Santhosh


-----Original Message-----
From: Thejas Nair [mailto:[email protected]]
Sent: Friday, October 21, 2011 11:22 AM
To: [email protected]
Subject: Re: Next Pig release proposal


Santhosh,
I thought you meant API stability for hadoop across major versions, but I
guess you are referring to stability within 0.23 versions. But the argument
applies to that as well: if 0.23.1 is not compatible with 0.23.0, we need
to call the release for 0.23.1 'pig 1.x for 0.23.1 api'.

We just need to communicate to the users that the
InputFormat/OutputFormat api's (and anything else we expose from
hadoop) depend on the hadoop version they are using.

I think it is just like different JNI libraries that you would write for
different OS. But the java version remains the same across OSs.


-Thejas


On 10/21/11 10:59 AM, Santhosh Srinivasan wrote:

Thejas,

I guess you did not read my email completely. You are referring to the
premise without examining the conclusion. I am repasting my entire email
to avoid confusion (I hate truncated references). If you could respond
again, it will bring us onto the same page.


<email>

Ref: http://tinyurl.com/4ng8upa (last discussion on 1.0)

How far have we progressed from our last discussion in March? There was
no consensus on the 1.0 release. Opinions ranged from having more
releases to bake in the maturity of the new parser and logical
plan changes to compatibility with Hadoop API (was compared to
Social Security - a very hot topic these days).

My concerns were around Hadoop API stability. I have heard that
the APIs will not be stable for at least 1 year. This is taking
me away from the Hadoop API stability factor (They passed healthcare in that
duration. Really!) Do we want compatibility with 0.23 as a gating factor
- not sure if this is anywhere close to getting done in the near future.
Will we support append (0.20.205)?


Btw, Hbase has been doing 0.90.1, 0.90.2, etc. So we can take a look at
this option too.


Santhosh



-----Original Message-----
From: Olga Natkovich [mailto:[email protected]]
Sent: Thursday, October 20, 2011 4:40 PM
To: [email protected]
Subject: Next Pig release proposal

Hi,

Here is what I propose we do for the next Pig release:


(1)    Branch early next week - we have major features and many bug
fixes in and will be fixing remaining bugs on the branch

(2)    Publish the release by 11/15 - that will give us a couple of
weeks to stabilize the branch and get last minute bug fixes in

(3)    Make this release a 1.0 release. Reasons to go for 1.0 and not
0.10

a.       This release has minimal number of features and was focused on
code stabilization and bug fixes. We believe it will be a stable release

<email/>

Thanks,
Santhosh

-----Original Message-----
From: Thejas Nair [mailto:[email protected]]
Sent: Friday, October 21, 2011 10:45 AM
To: [email protected]
Subject: Re: Next Pig release proposal

On 10/20/11 4:58 PM, Santhosh Srinivasan wrote:

Ref: http://tinyurl.com/4ng8upa (last discussion on 1.0)

How far have we progressed from our last discussion in March? There
was no consensus on the 1.0 release. Opinions ranged from having more
releases to bake in the maturity of the new parser and logical plan
changes to compatibility with Hadoop API (was compared to
Social Security - a very hot topic these days).

My concerns were around Hadoop API stability.


Over the next year or so, there are going to be two API
versions of hadoop to be supported - 0.20.x api's and 0.23
apis, as we will have userbase on both.

I think it is just a matter of releasing pig 1.0 for 0.20.x api's and
1.0 for 0.23.x api's.  We will have to come up with a numbering scheme
that reflects 'for hadoop version X' in our pig releases, regardless of
it being 0.10 or 1.0.
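
One possible shape for such a numbering scheme, purely as an illustration: the label format `pig-<version>-hadoop-<api>` and the names `PigRelease` and `matches_cluster` are hypothetical, not an existing Pig convention. The sketch also assumes that releases within one Hadoop API line (for example 0.20.x) are compatible with each other, which may not hold in practice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PigRelease:
    pig_version: str   # e.g. "1.0" or "0.10"
    hadoop_api: str    # Hadoop API line this build targets, e.g. "0.20"

    def label(self) -> str:
        # Hypothetical label format that makes the Hadoop API line explicit.
        return f"pig-{self.pig_version}-hadoop-{self.hadoop_api}"

def matches_cluster(release: PigRelease, cluster_hadoop_version: str) -> bool:
    """Check whether a release targets the cluster's Hadoop API line.

    Assumption: only the first two version components identify the API
    line, so 0.20.205 belongs to the 0.20 line.
    """
    cluster_line = cluster_hadoop_version.split(".")[:2]
    return cluster_line == release.hadoop_api.split(".")[:2]

builds = [PigRelease("1.0", "0.20"), PigRelease("1.0", "0.23")]
for build in builds:
    print(build.label(), matches_cluster(build, "0.20.205"))
```

The same Pig version appears once per supported Hadoop API line, so the label stays the same whether the Pig part reads 0.10 or 1.0.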


As there will be support for different api's of hadoop in pig releases,
I don't see a reason why the hadoop api stability should stop pig from
going 1.0.


-Thejas
