Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Chesnay, thanks for sharing your long term plan about
`flink-shaded-hadoop`!
Got your points. I also think we need a JIRA for that, to be filed at the
right time.
Best Regards,
Jincheng

Chesnay Schepler wrote on Mon, Mar 18, 2019 at 8:13 PM:

> Long term plan _is_ to move flink-shaded-hadoop to flink-shaded, I believe
> there's even a JIRA for that.
>
> Until that is in place, they _must_ retain the Flink version, as
> otherwise we'd be unable to change them in follow-up releases without
> changing the version scheme again.
>
> And even after the move they will retain the flink-shaded version like all
> other flink-shaded modules, for the above reason.
>
> On 18.03.2019 12:10, jincheng sun wrote:
>
> Hi Chesnay,
>
> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>
> Thank you for providing this link. It's very useful for contributors who
> want to check the RC on YARN.
>
> My suggestion may not have been described clearly; let me explain:
>
> 1. Since 1.8.0, Flink's release package no longer contains the corresponding
> Hadoop dependency, so the user has two ways to get the required Hadoop
> dependency:
>
>    1). Download an existing Hadoop version from the Flink downloads page.
>    2). Build the version required by the user from the source code (see
> https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#hadoop-versions).
> For example, if version 2.6.1 is required: `mvn clean install -DskipTests
> -Dhadoop.version=2.6.1`.
>
> 2. About how to manage the release of the Hadoop dependency JARs:
>
>    1). The name of the shaded Hadoop version should not include the Flink
> version. Taking your link as an example:
>    `.../flink-shaded-hadoop2-uber/2.4.1-1.8.0/xx.jar`
>    `.../flink-shaded-hadoop2-uber/2.6.5-1.8.0/xx.jar`
>    `.../flink-shaded-hadoop2-uber/2.7.5-1.8.0/xx.jar`
>    `.../flink-shaded-hadoop2-uber/2.8.3-1.8.0/xx.jar`
> I think we could change `2.4.1-1.8.0` to `2.4.1` in the version names above.
> That way, the same shaded Hadoop version can be used across many Flink
> versions; for example, the Hadoop 2.8.3 artifact would not only be available
> for Flink 1.8.0, but could also be used by Flink 1.8.x, Flink 1.9.x, etc.
>
>    2). Release the shaded Hadoop artifacts independently:
>    In the long term, we can release the shaded JARs independently and move
> `flink-shaded-hadoop` into `https://github.com/apache/flink-shaded`. So
> I suggest that we publish the Hadoop versions independently and share
> them across multiple Flink versions.
>
> What do you think?
>
> Best,
> Jincheng
>
>
Chesnay Schepler wrote on Mon, Mar 18, 2019 at 4:15 PM:
>
>> We release SNAPSHOT artifacts for all modules, see
>>
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
>> .
>>
>> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>>
>> Finally, we are already adding flink-shaded-hadoop to the optional
>> components section in this PR:
>> https://github.com/apache/flink-web/pull/180
>>
>> On 18.03.2019 08:55, jincheng sun wrote:
>> > -1
>> >
>> > Currently, we have released the Hadoop-related JARs as snapshot
>> > versions (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
>> > <
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
>> >),
>> > I think we should release a stable version.
>> > When testing the release code on YARN, users currently cannot find the
>> > Hadoop dependency. Although there is a download explanation for Hadoop in
>> > the PR [`Update Downloads page for Flink 1.8
>> > `], a 404 error occurs
>> > when you click Download (I left detailed comments in the PR).
>> >
>> > So, I suggest the following:
>> >
>> >    1. It would be better to add the changes to
>> > `downloads.html#optional-components`, adding the Hadoop-related JAR
>> > download links first.
>> >    2. Then add instructions on how to get the Hadoop dependencies, or
>> > add the correct download link directly in the next VOTE mail, since we
>> > do not include Hadoop in `flink-dist`.
>> >    3. Release stable versions of the Hadoop-related JARs.
>> >
>> > Then, contributors can test it more easily on YARN. What do you think?
>> >
>> > Best,
>> > Jincheng
>> >
>> >
>> > Chesnay Schepler wrote on Fri, Mar 15, 2019 at 10:35 PM:
>> >
>> >> -1
>> >>
>> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
>> >> binary distribution).
>> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>> >>
>> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>> >>> Hi everyone,
>> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
>> >> follows:
>> >>> [ ] +1, Approve the release
>> >>> [ ] -1, Do not approve the release (please provide specific 

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Aljoscha,

I have merged the fixes for the following issues, found in RC1 and RC2, into
the release-1.8 branch.

- Add `frocksdbjni` dependency in NOTICE - FLINK-11950
- Improve end-to-end test - FLINK-11892
- Deprecated Window API - FLINK-11918

Currently, I am performing functional testing of YARN cluster mode and
multiple operating systems. I think these test results will be valid for
the next RC as well.

Best,
Jincheng

Shaoxuan Wang wrote on Tue, Mar 19, 2019 at 11:45 AM:

> I tested RC2 with the following items:
> - Maven Central Repository contains all artifacts
> - Built the source with Maven (ensured all source files have Apache
> headers)
> - Checked checksums and GPG files (for instance, flink-core-1.8.0.jar) that
> match the corresponding release files
> - Verified that the source archives do not contain any binaries
> - Manually executed the tests in the IDE
>
> @Aljoscha, per the discussion in RC1, we should consider sending the
> release vote to the user group to gather more feedback.
> @Gordon and @Yu, I noticed there are some perf regressions that occurred on
> Jan 29 (and consistently exist after that) for the tests
> of stateBackends.FS and stateBackends.ROCKS_INC.
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
> @Chesnay, how did you notice and catch the license NOTICE issue? It seems
> very difficult to track. I am trying to understand the way we organize
> the license NOTICE files. For this case, why do we only need to add the
> dependency on 5.17.2-artisans-1.0 to the NOTICE file of flink-dist? It
> seems there are other modules that bundle the dependency on
> flink-statebackend.
>
> Regards,
> Shaoxuan
>
>
>
> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi,
> >
> > The regressions in the benchmark were also brought up earlier in this
> > thread by Yu.
> > From the previous investigations, these are the commits that touched
> > relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
> > around Jan / Feb:
> >
> > TupleSerializer -
> > 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
> > subclasses of TupleSerializerBase to use new serialization compatibility
> > abstractions
> >
> > AvroSerializer -
> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> > from TypeSerializer
> > 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
> > Java-deserialize AvroSerializer for backwards compatibility
> >
> > RowSerializer -
> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> > from TypeSerializer
> > b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table]
> Migrating
> > the RowSerializer to use new compatibility API
> >
> > The odd thing is, the times of these commits don't really match the drops
> > in their respective benchmark result timelines.
> > For the tupleKeyBy benchmark, the drop started around the end of January,
> > whereas the TupleSerializer was last touched in mid-February.
> > For the serializerRow and serializerAvro benchmarks, the drop occurred
> > around mid-February, whereas the only commit around that time was
> > 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
> >
> > The only possible explanation that I can offer for the AvroSerializer
> > benchmark drop for now is 479ebd5987 (FLINK-11436).
> > That commit had to touch the `readObject` method of the AvroSerializer,
> > which introduced some type checks / casts.
> > This may have caused a regression in deserializing the AvroSerializer
> > itself, which would have been accounted for in the job initialization
> > phase of the serializerAvro benchmark.
> > The commit should not have affected per-record performance of the
> > AvroSerializer.
> > However, again, the commit time for 479ebd5987 was the end of January,
> > whereas the benchmark result drop occurred around mid-February for the
> > serializerAvro benchmark.
> >
> > We haven't managed to identify any solid causes so far, only the above
> > speculations.
> >
> > Cheers,
> > Gordon
> >
> >
> > On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:
> >
> > > Piotr and I discovered a possible issue in the benchmarks.
> > >
> > > Looking at the time graphs, there seems to be one issue that came in
> > > around the end of January. It increased network throughput, but decreased
> > > overall performance and added more variation in time (possibly through
> > > GC). Check the trend in these graphs:
> > >
> > > Increased Throughput:
> > >
> > >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> > > Higher variance in count benchmark:
> > >
> > >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> > > Drop in tuple

Re: [REMINDER] Please add entries for newly added dependencies to NOTICE file

2019-03-18 Thread Ufuk Celebi
Hey Aljoscha,

thanks for bringing this up. I think that we should either integrate
checks for this into our CI/CD environment (using existing tools) or
add a conditional check for this into flink-bot in case a pom.xml was
modified. Otherwise it will be easy to forget in the future.

– Ufuk

On Mon, Mar 18, 2019 at 12:03 PM Aljoscha Krettek  wrote:
>
> Hi All,
>
> Please remember to add newly added dependencies to the NOTICE file of 
> flink-dist (which will then end up in NOTICE-binary and so on). Discovering 
> this late will cause delays in releases, as it is doing now.
>
> There is a handy guide that Chesnay and Till worked on that explains 
> licensing for Apache projects and Flink specifically: 
> https://cwiki.apache.org/confluence/display/FLINK/Licensing 
> 
>
> Best,
> Aljoscha
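The CI check Ufuk suggests above could be sketched roughly as follows. This is a hypothetical sketch: the hard-coded file list stands in for the output of `git diff --name-only`, and the exact policy (warn vs. fail the build) is an assumption, not something decided in the thread.

```shell
# Hypothetical CI guard: flag a change set that touches a pom.xml but no
# NOTICE file, since a new dependency may require a NOTICE entry.
changed_files="flink-dist/pom.xml
flink-dist/src/main/java/SomeClass.java"   # stand-in for `git diff --name-only`

warning=""
if echo "$changed_files" | grep -q 'pom\.xml' && \
   ! echo "$changed_files" | grep -q 'NOTICE'; then
  warning="pom.xml changed without a NOTICE update"
fi
echo "$warning"
```

A flinkbot integration along these lines would run the check against the PR's diff and post the warning as a review comment.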


[jira] [Created] (FLINK-11961) Clear up and refactor the code generation of scalar functions and operators

2019-03-18 Thread Jark Wu (JIRA)
Jark Wu created FLINK-11961:
---

 Summary: Clear up and refactor the code generation of scalar 
functions and operators
 Key: FLINK-11961
 URL: https://issues.apache.org/jira/browse/FLINK-11961
 Project: Flink
  Issue Type: Improvement
  Components: SQL / Planner
Reporter: Jark Wu


Currently, the code generation of scalar functions and operators is complex
and messy.

There are several ways to support codegen for a function/operator:
(1) Implement {{generate...}} in {{ScalarOperatorGens}} and invoke it in the
big match pattern of {{ExprCodeGenerator}}.
(2) Implement a {{CallGenerator}} and add it to {{FunctionGenerator}}.
(3) Implement a util method and add it to {{BuiltinMethods}} and
{{FunctionGenerator}}.

This confuses developers as to which is the preferred way to implement a
function.

In this issue, we propose a unified way to code-generate
functions/operators.

Some initial ideas:
1. Introduce an {{ExprCodeGen}} interface; all functions/operators should
extend it and implement the {{codegen}} method. It's like a combination
of {{PlannerExpression}} and {{CallGenerator}}.
2. Rename {{ExprCodeGenerator}} to {{RexCodeGenerator}}.
3. Use a big match pattern to map each {{RexCall}} to a specific {{ExprCodeGen}}.


{code:scala}
trait ExprCodeGen {

  def operands: Seq[GeneratedExpression]

  def resultType: InternalType

  def codegen(ctx: CodeGeneratorContext): GeneratedExpression
}

case class ConcatCodeGen(operands: Seq[GeneratedExpression]) extends ExprCodeGen {

  override def resultType: InternalType = InternalTypes.STRING

  override def codegen(ctx: CodeGeneratorContext): GeneratedExpression = {
    nullSafeCodeGen(ctx) {
      terms => s"$BINARY_STRING.concat(${terms.mkString(", ")})"
    }
  }
}
{code}


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11960) Check all the builtin functions and operators are reasonable

2019-03-18 Thread Jark Wu (JIRA)
Jark Wu created FLINK-11960:
---

 Summary: Check all the builtin functions and operators are 
reasonable
 Key: FLINK-11960
 URL: https://issues.apache.org/jira/browse/FLINK-11960
 Project: Flink
  Issue Type: Task
  Components: API / Table SQL
Reporter: Jark Wu


We introduced a lot of functions and operators in
{{flink-table-planner-blink}}. Most of them have not been discussed in the
community, and some of them may break the behavior of current Flink SQL.

We should revisit the functions and operators to accept the reasonable ones
and remove the non-standard ones.

Here is a list of all the Blink SQL functions and operators: 
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/functions/sql/FlinkSqlOperatorTable.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11959) Introduce window operator for blink streaming runtime

2019-03-18 Thread Kurt Young (JIRA)
Kurt Young created FLINK-11959:
--

 Summary: Introduce window operator for blink streaming runtime
 Key: FLINK-11959
 URL: https://issues.apache.org/jira/browse/FLINK-11959
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Operators
Reporter: Kurt Young
Assignee: Kurt Young


We introduced a new window operator in the blink streaming runtime. The
differences between blink's window operator and the one used in the DataStream
API are:
 # Blink's window operator is mainly used for window aggregates. It works
closely with SQL's aggregate functions, hence we didn't provide the flexibility
to apply an arbitrary `WindowFunction` as the DataStream API does. Instead, we
only need to save the intermediate accumulator state for the aggregate
functions. There is no need for us to save the original input rows into state,
which is much more efficient.
 # This new window operator can deal with retract messages.
 # We did some pane-based optimization within the sliding window operator,
similar to [FLINK-7001|https://issues.apache.org/jira/browse/FLINK-7001].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Shaoxuan Wang
I tested RC2 with the following items:
- Maven Central Repository contains all artifacts
- Built the source with Maven (ensured all source files have Apache headers)
- Checked checksums and GPG files (for instance, flink-core-1.8.0.jar) that
match the corresponding release files
- Verified that the source archives do not contain any binaries
- Manually executed the tests in the IDE
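The checksum step in this list can be sketched as below. This is a minimal sketch, assuming GNU coreutils `sha512sum` (on macOS, `shasum -a 512` would be used instead); a throwaway file stands in for a real release artifact such as flink-core-1.8.0.jar.

```shell
# Compute the SHA-512 of an artifact and verify it against the published
# .sha512 file, as one would for a release candidate artifact.
printf 'artifact-bytes' > artifact.jar
sha512sum artifact.jar > artifact.jar.sha512   # what the release would publish
sha512sum -c artifact.jar.sha512               # prints "artifact.jar: OK"
# The GPG step would be, analogously:
#   gpg --verify flink-core-1.8.0.jar.asc flink-core-1.8.0.jar
```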

@Aljoscha, per the discussion in RC1, we should consider sending the
release vote to the user group to gather more feedback.
@Gordon and @Yu, I noticed there are some perf regressions that occurred on
Jan 29 (and consistently exist after that) for the tests
of stateBackends.FS and stateBackends.ROCKS_INC.
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
@Chesnay, how did you notice and catch the license NOTICE issue? It seems
very difficult to track. I am trying to understand the way we organize
the license NOTICE files. For this case, why do we only need to add the
dependency on 5.17.2-artisans-1.0 to the NOTICE file of flink-dist? It
seems there are other modules that bundle the dependency on
flink-statebackend.

Regards,
Shaoxuan



On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> The regressions in the benchmark were also brought up earlier in this
> thread by Yu.
> From the previous investigations, these are the commits that touched
> relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
> around Jan / Feb:
>
> TupleSerializer -
> 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
> subclasses of TupleSerializerBase to use new serialization compatibility
> abstractions
>
> AvroSerializer -
> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> from TypeSerializer
> 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
> Java-deserialize AvroSerializer for backwards compatibility
>
> RowSerializer -
> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> from TypeSerializer
> b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table] Migrating
> the RowSerializer to use new compatibility API
>
> The odd thing is, the times of these commits don't really match the drops
> in their respective benchmark result timelines.
> For the tupleKeyBy benchmark, the drop started around the end of January,
> whereas the TupleSerializer was last touched in mid-February.
> For the serializerRow and serializerAvro benchmarks, the drop occurred
> around mid-February, whereas the only commit around that time was
> 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>
> The only possible explanation that I can offer for the AvroSerializer
> benchmark drop for now is 479ebd5987 (FLINK-11436).
> That commit had to touch the `readObject` method of the AvroSerializer,
> which introduced some type checks / casts.
> This may have caused a regression in deserializing the AvroSerializer
> itself, which would have been accounted for in the job initialization
> phase of the serializerAvro benchmark.
> The commit should not have affected per-record performance of the
> AvroSerializer.
> However, again, the commit time for 479ebd5987 was the end of January,
> whereas the benchmark result drop occurred around mid-February for the
> serializerAvro benchmark.
>
> We haven't managed to identify any solid causes so far, only the above
> speculations.
>
> Cheers,
> Gordon
>
>
> On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:
>
> > Piotr and I discovered a possible issue in the benchmarks.
> >
> > Looking at the time graphs, there seems to be one issue that came in around
> > the end of January. It increased network throughput, but decreased overall
> > performance and added more variation in time (possibly through GC). Check
> > the trend in these graphs:
> >
> > Increased Throughput:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> > Higher variance in count benchmark:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> > Drop in tuple-key-by performance trend:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> > In addition, the Avro and Row serializers seem to have a performance drop
> > since mid February:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> > @Gordon any idea what could be the cause of this?
> >
> >
> > On Mon, Mar 18, 2019 at 3:08 PM Yu Li  wrote:
> >
> > > Watching the benchmark data for days and indeed it's normal

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Tzu-Li (Gordon) Tai
Hi,

The regressions in the benchmark were also brought up earlier in this
thread by Yu.
From the previous investigations, these are the commits that touched
relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
around Jan / Feb:

TupleSerializer -
73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
subclasses of TupleSerializerBase to use new serialization compatibility
abstractions

AvroSerializer -
09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
from TypeSerializer
479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
Java-deserialize AvroSerializer for backwards compatibility

RowSerializer -
09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
from TypeSerializer
b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table] Migrating
the RowSerializer to use new compatibility API

The odd thing is, the times of these commits don't really match the drops
in their respective benchmark result timelines.
For the tupleKeyBy benchmark, the drop started around the end of January,
whereas the TupleSerializer was last touched in mid-February.
For the serializerRow and serializerAvro benchmarks, the drop occurred
around mid-February, whereas the only commit around that time was
09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).

The only possible explanation that I can offer for the AvroSerializer
benchmark drop for now is 479ebd5987 (FLINK-11436).
That commit had to touch the `readObject` method of the AvroSerializer,
which introduced some type checks / casts.
This may have caused a regression in deserializing the AvroSerializer
itself, which would have been accounted for in the job initialization
phase of the serializerAvro benchmark.
The commit should not have affected per-record performance of the
AvroSerializer.
However, again, the commit time for 479ebd5987 was the end of January,
whereas the benchmark result drop occurred around mid-February for the
serializerAvro benchmark.

We haven't managed to identify any solid causes so far, only the above
speculations.

Cheers,
Gordon


On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:

> Piotr and I discovered a possible issue in the benchmarks.
>
> Looking at the time graphs, there seems to be one issue that came in around
> the end of January. It increased network throughput, but decreased overall
> performance and added more variation in time (possibly through GC). Check
> the trend in these graphs:
>
> Increased Throughput:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> Higher variance in count benchmark:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> Drop in tuple-key-by performance trend:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
>
> In addition, the Avro and Row serializers seem to have a performance drop
> since mid February:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
>
> @Gordon any idea what could be the cause of this?
>
>
> On Mon, Mar 18, 2019 at 3:08 PM Yu Li  wrote:
>
> > Watching the benchmark data for days, it has indeed normalized for the
> > time being. However, the result seems to be unstable. I also tried the
> > benchmark locally and observed obvious fluctuation even with the same
> > commit...
> >
> > I guess we may need to improve it, such as by increasing
> > RECORDS_PER_INVOCATION to generate a reproducible result. IMHO a stable
> > micro benchmark is important for verifying perf-related improvements (and I
> > think the benchmark and website are already great ones but just need some
> > love). Let me put this on my backlog; I will open a JIRA when it's ready.
> >
> > Anyway, good to know it's not a regression, and thanks for the efforts
> > spent on checking it over! @Gordon @Chesnay
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler 
> wrote:
> >
> > > The regression is already normalizing again. I'd observe it further
> > > before doing anything.
> > >
> > > The same applies to the benchmarkCount which tanked even more in that
> > > same run.
> > >
> > > On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
> > > > @Yu
> > > > Thanks for reporting that Yu, great that this was noticed.
> > > >
> > > > The serializerAvro case seems to only be testing on-wire serialization.
> > > > I checked the changes to the `AvroSerializer`, and it seems like
> > > > FLINK-11436 [1] with commit 479ebd59 was the only change that may
> have
> > > > affected that.
> > > > That commit wasn't introduced exactly around the time when the
> > indicated
> > > > performance regression occurred, but was still before 

[jira] [Created] (FLINK-11958) flink on windows yarn deploy failed

2019-03-18 Thread Matrix42 (JIRA)
Matrix42 created FLINK-11958:


 Summary: flink on windows yarn deploy failed
 Key: FLINK-11958
 URL: https://issues.apache.org/jira/browse/FLINK-11958
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Reporter: Matrix42
Assignee: Matrix42


Flink Version : 1.7.2

Hadoop Version:2.7.5

Yarn log:

Application application_1551710861615_0002 failed 1 times due to AM Container 
for appattempt_1551710861615_0002_01 exited with exitCode: 1
For more detailed output, check the application tracking page:
http://DESKTOP-919H80J:8088/cluster/app/application_1551710861615_0002
Then, click on the links to the logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1551710861615_0002_01_01
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shell output: 移动了 1 个文件。 (english: Moved 1 file.)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

 

jobmanager.err:
'$JAVA_HOME' 不是内部或外部命令,也不是可运行的程序或批处理文件。

english: ('$JAVA_HOME' is not recognized as an internal or external command,
operable program or batch file.)

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Create a Flink ecosystem website

2019-03-18 Thread Becket Qin
Done. The writeup looks great!

On Mon, Mar 18, 2019 at 9:09 PM Robert Metzger  wrote:

> Nice, really good news on the INFRA front!
> I think the hardware specs sound reasonable. And a periodic backup of the
> website's database to Infra's backup solution sounds reasonable too.
>
> Can you accept and review my proposal for the website?
>
>
> On Sat, Mar 16, 2019 at 3:47 PM Becket Qin  wrote:
>
>> >
>> > I have a very capable and motivated frontend developer who would be
>> > willing to implement what I've mocked in my proposal.
>>
>>
>> That is awesome!
>>
>> I created a Jira ticket[1] with Apache Infra and got a reply. It looks
>> like the Apache Infra team could provide a decent VM. The last piece is how
>> to ensure the data is persisted so we won't lose the project info / user
>> feedback when the VM is down. If Apache Infra does not provide persistent
>> storage for DB backups, we can always ask for multiple VMs and do the
>> fault tolerance ourselves. It seems we can almost say the hardware side is
>> also ready.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> [1] https://issues.apache.org/jira/browse/INFRA-18010
>>
>> On Fri, Mar 15, 2019 at 5:39 PM Robert Metzger 
>> wrote:
>>
>> > Thank you for reaching out to Infra and about the Ember client.
>> > When I first saw the Ember repository, I thought it was the whole thing
>> > (frontend and backend), but while testing it, I realized it is "only" the
>> > frontend. I'm not sure if it makes sense to adjust the Ember Observer
>> > client, or just write a simple UI from scratch.
>> > I have a very capable and motivated frontend developer who would be
>> > willing to implement what I've mocked up in my proposal.
>> > In addition, I found somebody (Congxian Qiu) who seems to be eager to
>> help
>> > with this project for the backend:
>> > https://github.com/rmetzger/flink-community-tools/issues/4
>> >
>> > For Infra: I had the same experience when asking for more GitHub
>> > permissions for "flinkbot": they didn't respond on their mailing list,
>> > only on Jira.
>> >
>> >
>> >
>> > On Thu, Mar 14, 2019 at 2:45 PM Becket Qin 
>> wrote:
>> >
>> >> Thanks for writing up the specifications.
>> >>
>> >> Regarding the website source code, Austin found a website[1] whose
>> >> frontend code[2] is available publicly. It lacks some support (e.g.
>> >> login), but it is still a good starting point. One thing is that I did
>> >> not find a license statement for that source code. I'll reach out to
>> >> the author to see if they have any concerns about our usage.
>> >>
>> >> Apache Infra has not replied to my email regarding some details about
>> >> the VM. I'll open an Infra Jira ticket tomorrow if there is still no
>> >> response.
>> >>
>> >> Thanks,
>> >>
>> >> Jiangjie (Becket) Qin
>> >>
>> >> [1] https://emberobserver.com/
>> >> [2] https://github.com/emberobserver/client
>> >>
>> >>
>> >>
>> >> On Thu, Mar 14, 2019 at 1:35 AM Robert Metzger 
>> >> wrote:
>> >>
>> >>> @Bowen: I agree. Confluent Hub looks nicer, but it is on their company
>> >>> website. I guess the likelihood that they give out code from their
>> company
>> >>> website is fairly low.
>> >>> @Nils: Beam's page is similar to our Ecosystem page, which we'll
>> >>> reactivate as part of this PR:
>> >>> https://github.com/apache/flink-web/pull/187
>> >>>
>> >>> Spark-packages.org did not respond to my request.
>> >>> I will propose a short specification in Becket's initial document.
>> >>>
>> >>>
>> >>> On Mon, Mar 11, 2019 at 11:38 AM Niels Basjes 
>> wrote:
>> >>>
>>  Hi,
>> 
>>  The Beam project has something in this area that is simply a page
>>  within their documentation website:
>>  https://beam.apache.org/documentation/sdks/java-thirdparty/
>> 
>>  Niels Basjes
>> 
>>  On Fri, Mar 8, 2019 at 11:39 PM Bowen Li 
>> wrote:
>>  >
>>  > Confluent Hub for Kafka is another good example of this kind. I
>>  > personally prefer it over the Spark site. It may be worth checking it
>>  > out with the Kafka folks
>>  >
>>  > On Thu, Mar 7, 2019 at 6:06 AM Becket Qin 
>>  wrote:
>>  >>
>>  >> Absolutely! Thanks for the pointer. I'll submit a PR to update the
>>  >> ecosystem page and the navigation.
>>  >>
>>  >> Thanks,
>>  >>
>>  >> Jiangjie (Becket) Qin
>>  >>
>>  >> On Thu, Mar 7, 2019 at 8:47 PM Robert Metzger <
>> rmetz...@apache.org>
>>  wrote:
>>  >>
>>  >> > Okay. I will reach out to spark-packages.org and see if they
>> are
>>  willing
>>  >> > to share.
>>  >> >
>>  >> > Do you want to raise a PR to update the ecosystem page (maybe
>> sync
>>  with
>>  >> > the "Software Projects" listed here:
>>  >> >
>> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink)
>>  and
>>  >> > link it in the navigation?
>>  >> >
>>  >> > Best,
>>  >> > Robert
>>  >> >
>>  >> >
>>  >> > On Thu, Mar 7, 2019 at 10:13 AM Becket Qin <
>> b

[jira] [Created] (FLINK-11957) Expose failure cause in the API response when dispatcher fails to submit a job

2019-03-18 Thread Mark Cho (JIRA)
Mark Cho created FLINK-11957:


 Summary: Expose failure cause in the API response when dispatcher 
fails to submit a job
 Key: FLINK-11957
 URL: https://issues.apache.org/jira/browse/FLINK-11957
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST
Affects Versions: 1.7.2
Reporter: Mark Cho


We use the POST /jars/:jarid/run API endpoint to submit Flink jobs:

https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/rest_api.html#jars-jarid-run

Currently, whenever there is an error, the API response only returns the 
following info:
{code:java}
{
  "errors": [
"org.apache.flink.runtime.client.JobSubmissionException: Failed to submit 
job."
  ]
}
{code}
Since job submission can fail for multiple reasons, it would be helpful to have 
some information that tells us why the job submission failed. Currently, we 
have to dig into the Flink logs to find the root cause.
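Since a full stack trace may be too verbose for an API response, one lightweight option is to return just the chain of cause messages. A rough sketch of that idea (the class and method names below are hypothetical, not existing Flink APIs):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: flatten an exception's cause chain into a list of
// messages that the "errors" field of the REST response could carry, so
// clients see the root cause without digging through the JobManager logs.
public class CauseChainSketch {

    static List<String> causeMessages(Throwable t) {
        List<String> messages = new ArrayList<>();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            messages.add(cur.getClass().getSimpleName() + ": " + cur.getMessage());
        }
        return messages;
    }

    public static void main(String[] args) {
        Throwable root = new java.io.FileNotFoundException(
                "Cannot find checkpoint or savepoint file/directory on file system 's3'.");
        Throwable submission = new RuntimeException("Could not set up JobManager", root);
        // Prints one line per throwable in the chain, top-most first.
        for (String m : causeMessages(new Exception("Failed to submit job.", submission))) {
            System.out.println(m);
        }
    }
}
```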

 

Some examples of job submission failures:
{code:java}
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobExecutionException: Could not set up 
JobManager
at 
org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:36)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not set 
up JobManager
at 
org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:176)
at 
org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:1058)
at 
org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:308)
at 
org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:34)
... 7 more
Caused by: java.io.FileNotFoundException: Cannot find checkpoint or savepoint 
file/directory 
's3://us-east-1.spaas.test/checkpoints/metadata/spaas_app_mcho-flink_bp_test/cee4-155266396689/fa82a7d2c8dfb6f7fb14bf2e319d4367/chk-969/_metadata'
 on file system 's3'.
at 
org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.resolveCheckpointPointer(AbstractFsCheckpointStorage.java:241)
at 
org.apache.flink.runtime.state.filesystem.AbstractFsCheckpointStorage.resolveCheckpoint(AbstractFsCheckpointStorage.java:109)
at 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1100)
at 
org.apache.flink.runtime.jobmaster.JobMaster.tryRestoreExecutionGraphFromSavepoint(JobMaster.java:1241)
at 
org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1165)
at org.apache.flink.runtime.jobmaster.JobMaster.(JobMaster.java:296)
at 
org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:157)
... 10 more
{code}
{code:java}
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobExecutionException: Could not set up 
JobManager at 
org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:36)
 at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
 at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39) at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not 
set up JobManager at 
org.apache.flink.runtime.jobmaster.JobManagerRunner.(JobManagerRunner.java:176)
 at 
org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:1058)
 at 
org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$5(Dispatcher.java:308)
 at 
org.apache.flink.util.function.CheckedSupplier.lambda$unchecked$0(CheckedSupplier.java:34)
 ... 7 more Caused by: org.apache.flink.util.FlinkRuntimeException: 
Incompatible failover strategy - strategy 'Individual Task Restart' can only 
handle jobs with only disconnected tasks. at 
org.apache.flink.runtime.executiongraph.failover.RestartIndividualStrategy.notifyNewVertices
{code}

Re: [ANNOUNCEMENT] March 2019 Bay Area Apache Flink Meetup

2019-03-18 Thread Xuefu Zhang
Hi all,

This is a reminder that the next Bay Area Flink Meetup is about one week
away. Please RSVP if you plan to attend. Here is the list of the proposed
talks:

-- General community and project updates (Fabian Hueske)
-- Real-time experimentation and beyond with Flink at Pinterest (Parag
Kesar, Steven Bairos-Novak)
-- Flink now has a persistent metastore (Xuefu Zhang)

See you all next Monday!

Regards,

Xuefu Zhang

On Thu, Mar 7, 2019 at 5:08 PM Xuefu Zhang  wrote:

> Hi all,
>
> As an update, this meetup will take place at 505 Brannan St, San
> Francisco, CA.
> Many thanks to Pinterest for their generosity of hosting the event.
>
> At the same time, I'd like to solicit for your help on the following:
>
> 1. RSVP at the meetup page if you're attending.
> 2. Submit a talk if you'd like to share.
> 3. Spread the word via all means.
>
> As the meetup will occur right at the time when the open source
> communities gather around Strata Data Conference and Flink Forward SF,
> this event is hoped to be a great opportunity to reach out and get
> involved.
>
> Your kind assistance is greatly appreciated!
>
> Regards,
> Xuefu
>
>
>
> On Wed, Mar 6, 2019 at 2:09 PM Xuefu Zhang  wrote:
>
>> Hi all,
>>
>> This is a kind reminder that our next Flink meetup will be a couple of
>> weeks away. This is the opportunity to share experience or gain insights,
>> or just get socialized in the community.
>>
>> RSVP is required, which can be done at the meetup webpage.
>>
>> We are still finalizing the location. If you know of a company that could host,
>> please kindly let me know. Also, we can still accommodate a couple of
>> talks. Please also let me know if you'd like to present a talk.
>>
>> Thanks,
>> Xuefu
>>
>> On Thu, Feb 14, 2019 at 4:32 PM Xuefu Zhang  wrote:
>>
>>> Hi all,
>>>
>>> I'm very excited to announce that the community is planning the next
>>> meetup in Bay Area on March 25, 2019. The event is just announced on
>>> Meetup.com [1].
>>>
>>> To make the event successful, your participation and help will be
>>> needed. Currently, we are looking for an organization that can host the
>>> event. Please let me know if you have any leads.
>>>
>>> Secondly, we encourage Flink users and developers to take this as an
>>> opportunity to share experience or development. Thus, please let me know if
>>> you'd like to give a short talk.
>>>
>>> I look forward to meeting you all in the Meetup.
>>>
>>> Regards,
>>> Xuefu
>>>
>>> [1] https://www.meetup.com/Bay-Area-Apache-Flink-Meetup/events/258975465
>>>
>>


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Stephan Ewen
Piotr and I discovered a possible issue in the benchmarks.

Looking at the timeline graphs, there seems to be an issue that appeared
around the end of January. It increased network throughput, but decreased
overall performance and added more variance in time (possibly through GC).
Check the trend in these graphs:

Increased Throughput:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
Higher variance in count benchmark:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
Drop in tuple-key-by performance trend:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on

In addition, the Avro and Row serializers seem to have a performance drop
since mid February:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on

@Gordon any idea what could be the cause of this?


On Mon, Mar 18, 2019 at 3:08 PM Yu Li  wrote:

> Watching the benchmark data for days and indeed it's normalized for the
> time being. However, the result seems to be unstable. I also tried the
> benchmark locally and observed obvious wave even with the same commit...
>
> I guess we may need to improve it such as increasing the
> RECORDS_PER_INVOCATION to generate a reproducible result. IMHO a stable
> micro benchmark is important to verify perf-related improvements (and I
> think the benchmark and website are already great ones but just need some
> love). Let me mark this as one of my backlog and will open a JIRA when
> prepared.
>
> Anyway good to know it's not a regression, and thanks for the efforts spent
> on checking it over! @Gordon @Chesnay
>
> Best Regards,
> Yu
>
>
> On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler  wrote:
>
> > The regression is already normalizing again. I'd observe it further
> > before doing anything.
> >
> > The same applies to the benchmarkCount which tanked even more in that
> > same run.
> >
> > On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
> > > @Yu
> > > Thanks for reporting that Yu, great that this was noticed.
> > >
> > > The serializerAvro case seems to only be testing on-wire serialization.
> > > I checked the changes to the `AvroSerializer`, and it seems like
> > > FLINK-11436 [1] with commit 479ebd59 was the only change that may have
> > > affected that.
> > > That commit wasn't introduced exactly around the time when the
> indicated
> > > performance regression occurred, but was still before the regression.
> > > The commit introduced some instanceof type checks / type casting in the
> > > readObject of the AvroSerializer, which may have caused this.
> > >
> > > Currently investigating further.
> > >
> > > Cheers,
> > > Gordon
> > >
> > > On Fri, Mar 15, 2019 at 11:45 AM Yu Li  wrote:
> > >
> > >> Hi Aljoscha and all,
> > >>
> > >>  From our performance benchmark web site (
> > >> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
> > >> regression (-6.92%) on the serializerAvro case comparing the latest
> 100
> > >> revisions, which may need some attention. Thanks.
> > >>
> > >> Best Regards,
> > >> Yu
> > >>
> > >>
> > >> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek 
> > >> wrote:
> > >>
> > >>> Hi everyone,
> > >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> > >>> follows:
> > >>> [ ] +1, Approve the release
> > >>> [ ] -1, Do not approve the release (please provide specific comments)
> > >>>
> > >>>
> > >>> The complete staging area is available for your review, which
> includes:
> > >>> * JIRA release notes [1],
> > >>> * the official Apache source release and binary convenience releases
> to
> > >> be
> > >>> deployed to dist.apache.org  [2], which are
> > >>> signed with the key with fingerprint
> > >>> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> > >>> * all artifacts to be deployed to the Maven Central Repository [4],
> > >>> * source code tag "release-1.8.0-rc2" [5],
> > >>> * website pull request listing the new release [6]
> > >>> * website pull request adding announcement blog post [7].
> > >>>
> > >>> The vote will be open for at least 72 hours. It is adopted by
> majority
> > >>> approval, with at least 3 PMC affirmative votes.
> > >>>
> > >>> Thanks,
> > >>> Aljoscha
> > >>>
> > >>> [1]
> > >>>
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > >>> <
> > >>>
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> > >>> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> > >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> > >>> https://dist.apache.org/repos/dist/release

[DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-18 Thread Timo Walther

Hi everyone,

some of you might have already noticed the JIRA issue that I opened 
recently [1] about introducing a proper Java expression DSL for the 
Table API. Instead of using string-based expressions, we should aim for 
a unified, maintainable, programmatic Java DSL.


Some background: The Blink merging efforts and the big refactorings as 
part of FLIP-32 have revealed many shortcomings in the current Table & 
SQL API design. Most of these legacy issues cause problems nowadays in 
making the Table API a first-class API next to the DataStream API. An 
example is the ExpressionParser class [2]. It was implemented in the 
early days of the Table API using Scala parser combinators. Over the 
last years, this parser has caused many JIRA issues and user confusion on 
the mailing list, because the exceptions and syntax are not always 
straightforward.


For FLINK-11908, we added a temporary bridge instead of reimplementing 
the parser in Java for FLIP-32. However, this is only an intermediate 
solution until we make a final decision.


I would like to propose a new, parser-free version of the Java Table API:

https://docs.google.com/document/d/1r3bfR9R6q5Km0wXKcnhfig2XQ4aMiLG5h2MTx960Fg8/edit?usp=sharing

I already implemented an early prototype that shows that such a DSL is 
not much implementation effort and integrates nicely with all existing 
API methods.
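
To make the idea concrete, here is a toy sketch of what a parser-free expression DSL can look like; the names below are made up for illustration and do not reflect the prototype's actual API:

```java
// Toy sketch of a parser-free Java expression DSL (illustrative only).
// The expression "a + 12" is built as an AST of method calls instead of
// being parsed from a string by a Scala parser-combinator ExpressionParser.
public class ExprDslSketch {

    interface Expr {
        String asSummaryString();
    }

    record FieldRef(String name) implements Expr {
        public String asSummaryString() { return name; }
    }

    record Literal(Object value) implements Expr {
        public String asSummaryString() { return String.valueOf(value); }
    }

    record BinaryCall(String op, Expr left, Expr right) implements Expr {
        public String asSummaryString() {
            return "(" + left.asSummaryString() + " " + op + " " + right.asSummaryString() + ")";
        }
    }

    static Expr $(String field) { return new FieldRef(field); }
    static Expr lit(Object value) { return new Literal(value); }
    static Expr plus(Expr left, Expr right) { return new BinaryCall("+", left, right); }

    public static void main(String[] args) {
        // Equivalent of the string expression "a + 12", without any parsing.
        Expr e = plus($("a"), lit(12));
        System.out.println(e.asSummaryString()); // prints "(a + 12)"
    }
}
```

A real DSL would of course cover many more operations plus type checking, but the shape of the API — plain method calls producing an expression tree — is the point here.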


What do you think?

Thanks for your feedback,

Timo

[1] https://issues.apache.org/jira/browse/FLINK-11890

[2] 
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/expressions/PlannerExpressionParserImpl.scala




Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
I have been watching the benchmark data for days, and it has indeed
normalized for the time being. However, the results seem to be unstable. I
also tried the benchmark locally and observed obvious fluctuation even with
the same commit...

I guess we may need to improve it, e.g. by increasing
RECORDS_PER_INVOCATION, to generate reproducible results. IMHO a stable
micro benchmark is important for verifying perf-related improvements (and I
think the benchmark and website are already great ones, they just need some
love). Let me put this on my backlog; I will open a JIRA when it is
prepared.
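
For intuition on why a larger RECORDS_PER_INVOCATION helps: the fixed per-invocation overhead (setup, timer reads) is amortized over more records, so the measured per-record figure converges toward the true cost. A back-of-the-envelope sketch with purely illustrative numbers (not actual benchmark measurements):

```java
// Sketch of the amortization argument: measured time per record is
// (fixed overhead + records * true cost) / records, so the error term
// overhead/records shrinks as the records-per-invocation count grows.
public class InvocationSizeSketch {

    static double perRecordCost(long records, double fixedOverheadNs, double trueCostPerRecordNs) {
        return (fixedOverheadNs + records * trueCostPerRecordNs) / records;
    }

    public static void main(String[] args) {
        double overhead = 1000.0; // ns of fixed per-invocation overhead (illustrative)
        double trueCost = 10.0;   // ns per record (illustrative)
        System.out.println(perRecordCost(1_000, overhead, trueCost));     // prints 11.0  (10% off)
        System.out.println(perRecordCost(1_000_000, overhead, trueCost)); // prints 10.001 (~0.01% off)
    }
}
```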

Anyway good to know it's not a regression, and thanks for the efforts spent
on checking it over! @Gordon @Chesnay

Best Regards,
Yu


On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler  wrote:

> The regression is already normalizing again. I'd observe it further
> before doing anything.
>
> The same applies to the benchmarkCount which tanked even more in that
> same run.
>
> On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
> > @Yu
> > Thanks for reporting that Yu, great that this was noticed.
> >
> > The serializerAvro case seems to only be testing on-wire serialization.
> > I checked the changes to the `AvroSerializer`, and it seems like
> > FLINK-11436 [1] with commit 479ebd59 was the only change that may have
> > affected that.
> > That commit wasn't introduced exactly around the time when the indicated
> > performance regression occurred, but was still before the regression.
> > The commit introduced some instanceof type checks / type casting in the
> > readObject of the AvroSerializer, which may have caused this.
> >
> > Currently investigating further.
> >
> > Cheers,
> > Gordon
> >
> > On Fri, Mar 15, 2019 at 11:45 AM Yu Li  wrote:
> >
> >> Hi Aljoscha and all,
> >>
> >>  From our performance benchmark web site (
> >> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
> >> regression (-6.92%) on the serializerAvro case comparing the latest 100
> >> revisions, which may need some attention. Thanks.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek 
> >> wrote:
> >>
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> >>> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >> be
> >>> deployed to dist.apache.org  [2], which are
> >>> signed with the key with fingerprint
> >>> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc2" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >>> approval, with at least 3 PMC affirmative votes.
> >>>
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1]
> >>>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> <
> >>>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> >>> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> >>> https://dist.apache.org/repos/dist/release/flink/KEYS>
> >>> [4]
> >> https://repository.apache.org/content/repositories/orgapacheflink-1213
> >>> <
> https://repository.apache.org/content/repositories/orgapacheflink-1210/
> >>>
> >>> [5]
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> >>> <
> >>>
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> >>> [6] https://github.com/apache/flink-web/pull/180 <
> >>> https://github.com/apache/flink-web/pull/180>
> >>> [7] https://github.com/apache/flink-web/pull/179 <
> >>> https://github.com/apache/flink-web/pull/179>
> >>>
> >>> P.S. The difference to the previous RC1 is very small, you can fetch
> the
> >>> two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see
> >> the
> >>> difference in commits. Its fixes for the issues that led to the
> >>> cancellation of the previous RC plus smaller fixes. Most
> >>> verification/testing that was carried out should apply as is to this
> RC.
>
>
>


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
@Aljoscha I see, thanks for the quick response!

Best Regards,
Yu


On Mon, 18 Mar 2019 at 19:11, Aljoscha Krettek  wrote:

> @Yu Thanks for the pointer. This is because I didn’t yet update the
> buildbot configuration for the new release. It’s a point that is very low
> in the release guide but I think I’ll do that now.
>
> > On 18. Mar 2019, at 09:37, Yu Li  wrote:
> >
> > One supplement for point #2: there's a Note on the doc for the error, but
> > I'm wondering why we don't directly remove the -DarchetypeCatalog option
> in
> > the command and tell users to specify the catalog in settings.xml if they
> > prefer to. I mean, users tend to try the command first before checking
> > the note and will get the error. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Mon, 18 Mar 2019 at 16:30, Yu Li  wrote:
> >
> >> Issues observed when checking quick start:
> >>
> >> 1. The versions in the document
> >> <https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/projectsetup/java_api_quickstart.html>
> >> are still "1.9-SNAPSHOT" instead of "1.8.0"
> >>
> >> 2. The "Use Maven archetypes" command failed with below error:
> >> [ERROR] Failed to execute goal
> >> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
> >> (default-cli) on project standalone-pom: archetypeCatalog '
> >> https://repository.apache.org/content/repositories/snapshots/' is not
> >> supported anymore. Please read the plugin documentation for details.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler 
> wrote:
> >>
> >>> We release SNAPSHOT artifacts for all module, see
> >>>
> >>>
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
> >>> .
> >>>
> >>> The artifacts to be released do not have a SNAPSHOT suffix:
> >>>
> >>>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
> >>>
> >>> Finally, we are already adding flink-shaded-hadoop to the optional
> >>> components section in this PR:
> >>> https://github.com/apache/flink-web/pull/180
> >>>
> >>> On 18.03.2019 08:55, jincheng sun wrote:
>  -1
> 
>  Currently, we have released the Hadoop-related JARs as a snapshot
>  version (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT,
>  https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/).
>  I think we should release a stable version.
>  When testing the release code on YARN, users currently cannot find the
>  Hadoop dependency. Although there is a download explanation for Hadoop
>  in the PR [`Update Downloads page for Flink 1.8`], a 404 error occurs
>  when you click Download (I had left detailed comments in the PR).
> 
>  So, I suggest as follows:
> 
>    1. It would be better to add the changes for
>  `downloads.html#optional-components`, add the Hadoop relation JARs
> >>> download
>  link first.
>    2. Then add instructions on how to get the dependencies of the
> >>> Hadoop or
>  add the correct download link directly in the next VOTE mail, due to
> we
> >>> do
>  not include Hadoop in `flink-dist`.
>    3.  Release a stable version of the Hadoop-related JARs.
> 
>  Then, contributors can test it more easily on YARN.  What do you
> think?
> 
>  Best,
>  Jincheng
> 
> 
>  Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
> 
> > -1
> >
> > Missing dependencies in NOTICE file of flink-dist (and by extension
> the
> > binary distribution).
> > * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
> >
> > On 14.03.2019 13:42, Aljoscha Krettek wrote:
> >> Hi everyone,
> >> Please review and vote on the release candidate 2 for Flink 1.8.0,
> as
> > follows:
> >> [ ] +1, Approve the release
> >> [ ] -1, Do not approve the release (please provide specific
> comments)
> >>
> >>
> >> The complete staging area is available for your review, which
> >>> includes:
> >> * JIRA release notes [1],
> >> * the official Apache source release and binary convenience releases
> >>> to
> > be deployed to dist.apache.org  [2], which
> >>> are
> > signed with the key with fingerprint
> > F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> * source code tag "release-1.8.0-rc2" [5],
> >> * website pull request listing the new release [6]
> >> * website pull request adding announcement blog post [7].
> >>
> >> The vote will be open for at least 72 hours. It is adopted by
> majority
> > approval, with at least 3 PMC affirmative votes.
> >> Thanks,
> >> Aljoscha
> >>
> >> [1]
> >
> >>>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?proj

Re: [DISCUSS] A more restrictive JIRA workflow

2019-03-18 Thread Robert Metzger
@Fabian: I don't think this is a big problem. Moving away from "giving
everybody contributor permissions" to "giving it to some people" is not
risky.
I would leave this decision to the committers who are working with a person.


We should bring this discussion to a conclusion and implement the changes
to JIRA.


Nobody has raised any objections to the overall idea.

Points raised:
1. We need to update the contribution guide and describe the workflow.
2. I brought up changing Flinkbot so that it auto-closes PRs without
somebody assigned in JIRA.

Who wants to work on an update of the contribution guide?
If there are no volunteers, I'm happy to take care of this.


On Fri, Mar 15, 2019 at 9:20 AM Fabian Hueske  wrote:

> Hi,
>
> I'm not sure about adding an additional stage.
> Who's going to decide when to "promote" a user to a contributor, i.e.,
> grant assigning permission?
>
> Best, Fabian
>
> Am Do., 14. März 2019 um 13:50 Uhr schrieb Timo Walther <
> twal...@apache.org
> >:
>
> > Hi Robert,
> >
> > I also like the idea of making every Jira user an "Assignable User", but
> > restrict assigning a ticket to people with committer permissions.
> >
> > Instead of giving contributor permissions to everyone, we could have a
> > more staged approach from user, to contributor, and finally to committer.
> >
> > Once people worked on a couple of JIRA issues, we can make them
> > contributors.
> >
> > What do you think?
> >
> > Regards,
> > Timo
> >
> > Am 06.03.19 um 12:33 schrieb Robert Metzger:
> > > Hi Tison,
> > > I also thought about this.
> > > Making a person a "Contributor" is required for being an "Assignable
> > User",
> > > so normal Jira accounts can't be assigned to a ticket.
> > >
> > > We could make every Jira user an "Assignable User", but restrict
> > assigning
> > > a ticket to people with committer permissions.
> > > There are some other permissions attached to the "Contributor" role,
> such
> > > as "Closing" and "Editing" (including "Transition", "Logging work",
> > etc.).
> > > I think we should keep the "Contributor" role, but we could be (as you
> > > propose) make it more restrictive. Maybe "invite only" for people who
> are
> > > apparently active in our Jira.
> > >
> > > Best,
> > > Robert
> > >
> > >
> > >
> > > On Wed, Mar 6, 2019 at 11:02 AM ZiLi Chen 
> wrote:
> > >
> > >> Hi devs,
> > >>
> > >> Just now I found that a non-contributor can file issues and
> > >> participate in discussions.
> > >> Someone who becomes a contributor can additionally assign an issue to
> > >> a person and modify the fields of any issue.
> > >>
> > >> For a more restrictive JIRA workflow, maybe we can achieve it by being
> > >> a bit more restrictive in granting contributor permission?
> > >>
> > >> Best,
> > >> tison.
> > >>
> > >>
> > >> Robert Metzger  于2019年2月27日周三 下午9:53写道:
> > >>
> > >>> I like this idea and I would like to try it to see if it solves the
> > >>> problem.
> > >>>
> > >>> I can also offer to add a functionality to the Flinkbot to
> > automatically
> > >>> close pull requests which have been opened against a unassigned JIRA
> > >>> ticket.
> > >>> Being rejected by an automated system, which just applies a rule is
> > nicer
> > >>> than being rejected by a person.
> > >>>
> > >>>
> > >>> On Wed, Feb 27, 2019 at 1:45 PM Stephan Ewen 
> wrote:
> > >>>
> >  @Chesnay - yes, this is possible, according to infra.
> > 
> >  On Wed, Feb 27, 2019 at 11:09 AM ZiLi Chen 
> > >> wrote:
> > > Hi,
> > >
> > > @Hequn
> > > It might be hard to separate JIRAs into conditional and
> unconditional
> >  ones.
> > > Even if INFRA supports such separation, we face the problem of whether
> > > a contributor is allowed to decide the type of a JIRA. If so, then
> > > contributors might tend to create JIRAs as unconditional; and if not,
> > > we fall back to a contributor asking a committer to set the JIRA as
> > > unconditional, which is no better than asking a committer to assign it
> > > to the contributor.
> > > ask a committer for assigning to the contributor.
> > >
> > > @Timo
> > > "More discussion before opening a PR" sounds good. However, it
> > >> requires
> > > more
> > > effort/participation from committer's side. From my own side, it's
> >  exciting
> > > to
> > > see our committers become more active :-)
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Chesnay Schepler  于2019年2月27日周三 下午5:06写道:
> > >
> > >> We currently cannot change the JIRA permissions. Have you asked
> > >> INFRA
> > >> whether it is possible to setup a Flink-specific permission
> scheme?
> > >>
> > >> On 25.02.2019 14:23, Timo Walther wrote:
> > >>> Hi everyone,
> > >>>
> > >>> as some of you might have noticed during the last weeks, the
> > >> Flink
> > >>> community grew quite a bit. A lot of people have applied for
> > >>> contributor permissions and started working on issues, which is
> > >>> gre

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-18 Thread Robert Metzger
Nice, really good news on the INFRA front!
I think the hardware specs sound reasonable, and a periodic backup of the
website's database to Infra's backup solution makes sense too.

Could you review and accept my proposal for the website?


On Sat, Mar 16, 2019 at 3:47 PM Becket Qin  wrote:

> >
> > I have a very capable and motivated frontend developer who would be
> > willing to implement what I've mocked in my proposal.
>
>
> That is awesome!
>
> I created a Jira ticket [1] with Apache Infra and got a reply. It looks
> like the Apache Infra team can provide a decent VM. The last piece is how
> to ensure the data is persisted so we won't lose the project info / user
> feedback when the VM goes down. If Apache Infra does not provide
> persistent storage for DB backups, we can always ask for multiple VMs and
> do the fault tolerance ourselves. It seems we can almost say the hardware
> side is ready as well.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> [1] https://issues.apache.org/jira/browse/INFRA-18010
>
> On Fri, Mar 15, 2019 at 5:39 PM Robert Metzger 
> wrote:
>
> > Thank you for reaching out to Infra and the ember client.
> > When I first saw the Ember repository, I thought it was the whole thing
> > (frontend and backend), but while testing it, I realized it is "only" the
> > frontend. I'm not sure if it makes sense to adjust the Ember observer
> > client, or just write a simple UI from scratch.
> > I have a very capable and motivated frontend developer who would be
> > willing to implement what I've mocked in my proposal.
> > In addition, I found somebody (Congxian Qiu) who seems to be eager to
> help
> > with this project for the backend:
> > https://github.com/rmetzger/flink-community-tools/issues/4
> >
> > For Infra: I had the same experience when asking for more GitHub
> > permissions for "flinkbot": They didn't respond on their mailing list,
> only
> > on Jira.
> >
> >
> >
> > On Thu, Mar 14, 2019 at 2:45 PM Becket Qin  wrote:
> >
> >> Thanks for writing up the specifications.
> >>
> >> Regarding the website source code, Austin found a website[1] whose
> >> frontend code [2] is available publicly. It lacks some support (e.g.
> >> login), but it is still a good starting point. One thing is that I did
> >> not find a license statement for that source code. I'll reach out to
> >> the author to see if they have any concerns over our usage.
> >>
> >> Apache Infra has not replied to my email regarding some details about
> the
> >> VM. I'll open an infra Jira ticket tomorrow if there is still no
> response.
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> [1] https://emberobserver.com/
> >> [2] https://github.com/emberobserver/client
> >>
> >>
> >>
> >> On Thu, Mar 14, 2019 at 1:35 AM Robert Metzger 
> >> wrote:
> >>
> >>> @Bowen: I agree. Confluent Hub looks nicer, but it is on their company
> >>> website. I guess the likelihood that they give out code from their
> company
> >>> website is fairly low.
> >>> @Nils: Beam's page is similar to our Ecosystem page, which we'll
> >>> reactivate as part of this PR:
> >>> https://github.com/apache/flink-web/pull/187
> >>>
> >>> Spark-packages.org did not respond to my request.
> >>> I will propose a short specification in Becket's initial document.
> >>>
> >>>
> >>> On Mon, Mar 11, 2019 at 11:38 AM Niels Basjes  wrote:
> >>>
>  Hi,
> 
>  The Beam project has something in this area that is simply a page
>  within their documentation website:
>  https://beam.apache.org/documentation/sdks/java-thirdparty/
> 
>  Niels Basjes
> 
>  On Fri, Mar 8, 2019 at 11:39 PM Bowen Li  wrote:
>  >
>  > Confluent hub for Kafka is another good example of this kind. I
>  personally like it over the Spark site. It may be worth checking it out
>  with the Kafka folks
>  >
>  > On Thu, Mar 7, 2019 at 6:06 AM Becket Qin 
>  wrote:
>  >>
>  >> Absolutely! Thanks for the pointer. I'll submit a PR to update the
>  >> ecosystem page and the navigation.
>  >>
>  >> Thanks,
>  >>
>  >> Jiangjie (Becket) Qin
>  >>
>  >> On Thu, Mar 7, 2019 at 8:47 PM Robert Metzger  >
>  wrote:
>  >>
>  >> > Okay. I will reach out to spark-packages.org and see if they are
>  willing
>  >> > to share.
>  >> >
>  >> > Do you want to raise a PR to update the ecosystem page (maybe
> sync
>  with
>  >> > the "Software Projects" listed here:
>  >> >
> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink)
>  and
>  >> > link it in the navigation?
>  >> >
>  >> > Best,
>  >> > Robert
>  >> >
>  >> >
>  >> > On Thu, Mar 7, 2019 at 10:13 AM Becket Qin  >
>  wrote:
>  >> >
>  >> >> Hi Robert,
>  >> >>
>  >> >> I think it is at least worth checking whether the
>  >> >> spark-packages.org owners are willing to share. Thanks for
>  >> >> volunteering to write the requirement descriptions! In any case,
>  >> >> that wil

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
Long term plan _is_ to move flink-shaded-hadoop to flink-shaded, I 
believe there's even a JIRA for that.


Until that is in place they _must_ retain the Flink version, as 
otherwise we'd be unable to change them in follow-up releases without 
changing the version scheme again.


And even after the move they will retain the flink-shaded version like 
all other flink-shaded modules, for the above reason.


On 18.03.2019 12:10, jincheng sun wrote:

Hi Chesnay,

The artifacts to be released do not have a SNAPSHOT suffix:

https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/

Thank you for providing this link. It's very useful for contributors 
who want to check the RC on YARN.


My suggestion may not have been described clearly; let me explain:

1. Since 1.8.0, Flink's release packages no longer contain the 
corresponding Hadoop dependency, so users have two ways to get the 
required Hadoop dependency:


   1). Download a pre-built Hadoop version from the Flink download page.
   2). Build the required version from source 
(see 
https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#hadoop-versions); 
for example, if version 2.6.1 is required: `mvn clean install -DskipTests 
-Dhadoop.version=2.6.1`.
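A minimal sketch of the two options, with the Hadoop and Flink versions as illustrative values (the JAR path pattern follows the staging-repository layout quoted elsewhere in this thread; nothing is downloaded or built here, the commands and paths are only constructed and printed):

```shell
# Illustrative versions, not a recommendation
HADOOP_VERSION=2.6.1
FLINK_VERSION=1.8.0

# Option 2): build Flink against the required Hadoop version from source
BUILD_CMD="mvn clean install -DskipTests -Dhadoop.version=${HADOOP_VERSION}"
echo "${BUILD_CMD}"

# Option 1): a pre-built shaded uber JAR follows this hadoopVersion-flinkVersion
# path pattern in the staging repository
echo "flink-shaded-hadoop2-uber/${HADOOP_VERSION}-${FLINK_VERSION}/flink-shaded-hadoop2-uber-${HADOOP_VERSION}-${FLINK_VERSION}.jar"
```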


2. On how to manage the release of the Hadoop dependency JARs:

   1). The shaded Hadoop version name should not include the Flink 
version; take your link as an example:

 `.../flink-shaded-hadoop2-uber/2.4.1-1.8.0/xx.jar`
 `.../flink-shaded-hadoop2-uber/2.6.5-1.8.0/xx.jar`
 `.../flink-shaded-hadoop2-uber/2.7.5-1.8.0/xx.jar`
 `.../flink-shaded-hadoop2-uber/2.8.3-1.8.0/xx.jar`
I think the version name above could be changed from `2.4.1-1.8.0` to 
`2.4.1`. That is, the same shaded Hadoop version could be used across 
many Flink versions: for example, the 2.8.3 Hadoop JAR would not only 
be available for Flink 1.8.0, but could also be used by Flink 1.8.x, 
1.9.x, etc.


   2). Release the shaded Hadoop JARs independently:
   In the long term, we can release the shaded JARs independently and 
move `flink-shaded-hadoop` into 
https://github.com/apache/flink-shaded. So I suggest that we publish 
the Hadoop versions independently and share them across multiple 
Flink versions.

What do you think?

Best,
Jincheng


Chesnay Schepler <ches...@apache.org> 于2019年3月18日周一 下午4:15写道:


We release SNAPSHOT artifacts for all modules, see

https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/

.

The artifacts to be released do not have a SNAPSHOT suffix:

https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/

Finally, we are already adding flink-shaded-hadoop to the optional
components section in this PR:
https://github.com/apache/flink-web/pull/180

On 18.03.2019 08:55, jincheng sun wrote:
> -1
>
> Currently, we have released the Hadoop-related JARs as a snapshot
> version (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT),
> I think we should release a stable version.
> When testing the release code on YARN, users currently cannot find the
> Hadoop dependency. Although there is a download explanation for Hadoop
> in the PR [`Update Downloads page for Flink 1.8`], a 404 error occurs
> when you click Download (I left detailed comments in the PR).
>
> So, I suggest as follows:
>
>    1. It would be better to first add the changes for
> `downloads.html#optional-components`, i.e. the download links for the
> Hadoop-related JARs.
>    2. Then add instructions on how to get the Hadoop dependencies, or
> add the correct download link directly in the next VOTE mail, since we
> do not include Hadoop in `flink-dist`.
>    3. Release a stable version of the Hadoop-related JARs.
>
> Then, contributors can test it more easily on YARN.  What do you
think?
>
> Best,
> Jincheng
>
>
> Chesnay Schepler mailto:ches...@apache.org>> 于2019年3月15日周五 下午10:35写道:
>
>> -1
>>
>> Missing dependencies in NOTICE file of flink-dist (and by
extension the
>> binary distribution).
>> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>>
>> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>>> Hi everyone,
>>> Please review and vote on the release candidate 2 for Flink
1.8.0, as
>> follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific
comments)
>>>
>>>
>>> The complete staging area is available for your review, which
includes:
>>> * JIRA release notes [1],
  

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
Additionally, which I unfortunately did not realize earlier: we must also 
remove the "org.rocksdb:rocksdbjni" entry from the NOTICE files (i.e. 
replace it with frocksdb).


On 15.03.2019 15:35, Chesnay Schepler wrote:

-1

Missing dependencies in NOTICE file of flink-dist (and by extension 
the binary distribution).

* com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0

On 14.03.2019 13:42, Aljoscha Krettek wrote:

Hi everyone,
Please review and vote on the release candidate 2 for Flink 1.8.0, as 
follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases 
to be deployed to dist.apache.org  [2], 
which are signed with the key with fingerprint 
F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc2" [5],
* website pull request listing the new release [6]
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by 
majority approval, with at least 3 PMC affirmative votes.


Thanks,
Aljoscha

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1213
[5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
[6] https://github.com/apache/flink-web/pull/180
[7] https://github.com/apache/flink-web/pull/179



P.S. The difference to the previous RC1 is very small; you can fetch 
the two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2" 
to see the difference in commits. It contains fixes for the issues 
that led to the cancellation of the previous RC, plus smaller fixes. 
Most verification/testing that was carried out should apply as-is to 
this RC.
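The suggested comparison can be rehearsed end-to-end in a throwaway repository. The dummy commits and messages below are illustrative; only the tag names and the double-dot range syntax mirror the P.S. above:

```shell
# Build a scratch repo with two tags, then list the commits between them
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "base"
git tag release-1.8.0-rc1
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Fix NOTICE file of flink-dist"
git tag release-1.8.0-rc2
# Commits reachable from rc2 but not from rc1:
RC_DIFF=$(git log --oneline release-1.8.0-rc1..release-1.8.0-rc2)
echo "$RC_DIFF"
```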








Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Chesnay,

The artifacts to be released do not have a SNAPSHOT suffix:
>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/

Thank you for providing this link. It's very useful for contributors who
want to check the RC on YARN.

My suggestion may not have been described clearly; let me explain:

1. Since 1.8.0, Flink's release packages no longer contain the corresponding
Hadoop dependency, so users have two ways to get the required Hadoop
dependency:

   1). Download a pre-built Hadoop version from the Flink download page.
   2). Build the required version from source (see
https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#hadoop-versions);
for example, if version 2.6.1 is required: `mvn clean install -DskipTests
-Dhadoop.version=2.6.1`.

2. On how to manage the release of the Hadoop dependency JARs:

   1). The shaded Hadoop version name should not include the Flink version;
take your link as an example:
   `.../flink-shaded-hadoop2-uber/2.4.1-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.6.5-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.7.5-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.8.3-1.8.0/xx.jar`
I think the version name above could be changed from `2.4.1-1.8.0` to
`2.4.1`. That is, the same shaded Hadoop version could be used across many
Flink versions: for example, the 2.8.3 Hadoop JAR would not only be
available for Flink 1.8.0, but could also be used by Flink 1.8.x, 1.9.x, etc.

   2). Release the shaded Hadoop JARs independently:
   In the long term, we can release the shaded JARs independently and move
`flink-shaded-hadoop` into `https://github.com/apache/flink-shaded`. So I
suggest that we publish the Hadoop versions independently and share them
across multiple Flink versions.

What do you think?

Best,
Jincheng


Chesnay Schepler  于2019年3月18日周一 下午4:15写道:

> We release SNAPSHOT artifacts for all modules, see
>
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
> .
>
> The artifacts to be released do not have a SNAPSHOT suffix:
>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>
> Finally, we are already adding flink-shaded-hadoop to the optional
> components section in this PR:
> https://github.com/apache/flink-web/pull/180
>
> On 18.03.2019 08:55, jincheng sun wrote:
> > -1
> >
> > Currently, we have released the Hadoop-related JARs as a snapshot
> > version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
> > <
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
> >),
> > I think we should release a stable version.
> > When testing the release code on YARN, currently user cannot find out the
> > Hadoop dependency.  Although there is a download explanation for Hadoop
> in
> > PR [`Update Downloads page for Flink 1.8
> > `], a 404 error
> occurs
> > when you click Download ( I had left detail comments in the PR).
> >
> > So, I suggest as follows:
> >
> >1. It would be better to add the changes for
> > `downloads.html#optional-components`, add the Hadoop relation JARs
> download
> > link first.
> >2. Then add instructions on how to get the dependencies of the Hadoop
> or
> > add the correct download link directly in the next VOTE mail, due to we
> do
> > not include Hadoop in `flink-dist`.
> >    3. Release a stable version of the Hadoop-related JARs.
> >
> > Then, contributors can test it more easily on YARN.  What do you think?
> >
> > Best,
> > Jincheng
> >
> >
> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
> >
> >> -1
> >>
> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
> >> binary distribution).
> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
> >>
> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> >> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >> be deployed to dist.apache.org  [2], which are
> >> signed with the key with fingerprint
> >> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc2" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >> approval, with at least 3 PMC affirmative votes.
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1]
> >>
> https://issues.apache.org/jira/sec

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Aljoscha Krettek
@Yu Thanks for the pointer. This is because I didn’t yet update the buildbot 
configuration for the new release. It’s a point that is very low in the release 
guide but I think I’ll do that now.

> On 18. Mar 2019, at 09:37, Yu Li  wrote:
> 
> One supplement for point #2: there's a Note on the doc for the error, but
> I'm wondering why we don't directly remove the -DarchetypeCatalog option in
> the command and tell users to specify the catalog in settings.xml if they
> prefer to. I mean, users tend to try the command first before checking the
> note and will hit the error. Thanks.
> 
> Best Regards,
> Yu
> 
> 
> On Mon, 18 Mar 2019 at 16:30, Yu Li  wrote:
> 
>> Issues observed when checking quick start:
>> 
>> 1. The versions on the document are still "1.9-SNAPSHOT" instead of
>> "1.8.0"
>> 
>> 2. The "Use Maven archetypes" command failed with below error:
>> [ERROR] Failed to execute goal
>> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
>> (default-cli) on project standalone-pom: archetypeCatalog '
>> https://repository.apache.org/content/repositories/snapshots/' is not
>> supported anymore. Please read the plugin documentation for details.
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler  wrote:
>> 
>>> We release SNAPSHOT artifacts for all modules, see
>>> 
>>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
>>> .
>>> 
>>> The artifacts to be released do not have a SNAPSHOT suffix:
>>> 
>>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>>> 
>>> Finally, we are already adding flink-shaded-hadoop to the optional
>>> components section in this PR:
>>> https://github.com/apache/flink-web/pull/180
>>> 
>>> On 18.03.2019 08:55, jincheng sun wrote:
 -1
 
 Currently, we have released the Hadoop-related JARs as a snapshot
 version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
 <
>>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
 ),
 I think we should release a stable version.
 When testing the release code on YARN, currently user cannot find out
>>> the
 Hadoop dependency.  Although there is a download explanation for Hadoop
>>> in
 PR [`Update Downloads page for Flink 1.8
 `], a 404 error
>>> occurs
 when you click Download ( I had left detail comments in the PR).
 
 So, I suggest as follows:
 
   1. It would be better to add the changes for
 `downloads.html#optional-components`, add the Hadoop relation JARs
>>> download
 link first.
   2. Then add instructions on how to get the dependencies of the
>>> Hadoop or
 add the correct download link directly in the next VOTE mail, due to we
>>> do
 not include Hadoop in `flink-dist`.
   3. Release a stable version of the Hadoop-related JARs.
 
 Then, contributors can test it more easily on YARN.  What do you think?
 
 Best,
 Jincheng
 
 
 Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
 
> -1
> 
> Missing dependencies in NOTICE file of flink-dist (and by extension the
> binary distribution).
> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
> 
> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>> Hi everyone,
>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>> 
>> 
>> The complete staging area is available for your review, which
>>> includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience releases
>>> to
> be deployed to dist.apache.org  [2], which
>>> are
> signed with the key with fingerprint
> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.8.0-rc2" [5],
>> * website pull request listing the new release [6]
>> * website pull request adding announcement blog post [7].
>> 
>> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>> Thanks,
>> Aljoscha
>> 
>> [1]
> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> <
> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> https://di

[jira] [Created] (FLINK-11956) Remove shading from filesystems build

2019-03-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11956:
--

 Summary: Remove shading from filesystems build
 Key: FLINK-11956
 URL: https://issues.apache.org/jira/browse/FLINK-11956
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Stefan Richter






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11955) Modify build to move filesystems from lib to plugins folder

2019-03-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11955:
--

 Summary: Modify build to move filesystems from lib to plugins 
folder
 Key: FLINK-11955
 URL: https://issues.apache.org/jira/browse/FLINK-11955
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Stefan Richter






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11953) Introduce Plugin/Loading system and integrate it with FileSystem

2019-03-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11953:
--

 Summary: Introduce Plugin/Loading system and integrate it with 
FileSystem
 Key: FLINK-11953
 URL: https://issues.apache.org/jira/browse/FLINK-11953
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Stefan Richter






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11954) Provide unit test(s) for new plugin infrastructure

2019-03-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11954:
--

 Summary: Provide unit test(s) for new plugin infrastructure
 Key: FLINK-11954
 URL: https://issues.apache.org/jira/browse/FLINK-11954
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Stefan Richter






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[REMINDER] Please add entries for newly added dependencies to NOTICE file

2019-03-18 Thread Aljoscha Krettek
Hi All,

Please remember to add newly added dependencies to the NOTICE file of 
flink-dist (which will then end up in NOTICE-binary and so on). Discovering 
this late will cause delays in releases, as it is doing now.

There is a handy guide that Chesnay and Till worked on that explains licensing 
for Apache projects and Flink specifically: 
https://cwiki.apache.org/confluence/display/FLINK/Licensing 
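For reference, a bundled-dependency entry in flink-dist's NOTICE file conventionally looks roughly like the following. This is an illustrative sketch using the frocksdbjni dependency mentioned earlier in the thread; the license heading must match the dependency's actual license terms:

```text
This project bundles the following dependencies under the
Apache Software License 2.0. (see licenses/LICENSE.frocksdb)

- com.data-artisans:frocksdbjni:5.17.2-artisans-1.0
```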


Best,
Aljoscha

[jira] [Created] (FLINK-11952) Introduce Plugin Architecture for loading FileSystems

2019-03-18 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11952:
--

 Summary: Introduce Plugin Architecture for loading FileSystems
 Key: FLINK-11952
 URL: https://issues.apache.org/jira/browse/FLINK-11952
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / FileSystem
Affects Versions: 1.9.0
Reporter: Stefan Richter
Assignee: Stefan Richter


We want to change the general architecture for loading FileSystems in Flink to 
a plugin architecture. The advantage of this change is that it would invert the 
classloading from parent-first to child-first and therefore enables us to move 
away from shading to avoid class/version conflicts.

Note that this general approach could also be used in other places in Flink in 
the future, but this task targets only the file systems for now.
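The key idea of FLINK-11952, inverting delegation from parent-first to child-first, can be sketched as follows. This is an illustrative standalone example, not Flink's actual implementation; the class name and structure are assumptions:

```java
import java.net.URL;
import java.net.URLClassLoader;

// A child-first classloader looks in its own URLs (the plugin's jars)
// before delegating to the parent. This inversion lets a plugin bundle
// its own dependency versions without shading them.
public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // 1. try the plugin's own jars first ...
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // 2. ... and only then fall back to the parent
                    //    (e.g. for JDK and Flink API classes)
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        ChildFirstClassLoader cl = new ChildFirstClassLoader(
                new URL[0], ChildFirstClassLoader.class.getClassLoader());
        // With no plugin jars, JDK classes still resolve via the parent fallback:
        System.out.println(cl.loadClass("java.lang.String").getName());
    }
}
```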



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11951) Enhance UserDefinedFunction interface to allow more user defined types

2019-03-18 Thread Jingsong Lee (JIRA)
Jingsong Lee created FLINK-11951:


 Summary: Enhance UserDefinedFunction interface to allow more user 
defined types
 Key: FLINK-11951
 URL: https://issues.apache.org/jira/browse/FLINK-11951
 Project: Flink
  Issue Type: Improvement
Reporter: Jingsong Lee
Assignee: Jingsong Lee


1. Allow UDF & UDTF to access constant parameter values in getReturnType; see 
the similar feature in Hive: 
[https://hive.apache.org/javadocs/r2.2.0/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.html#initializeAndFoldConstants-org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector:A-]

2. Allow AggregateFunction to decide its user-defined input types via 
argClasses.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
One supplement for point #2: there's a Note on the doc for the error, but
I'm wondering why we don't directly remove the -DarchetypeCatalog option in
the command and tell users to specify the catalog in settings.xml if they
prefer to. I mean, users tend to try the command first before checking the
note and will hit the error. Thanks.
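Yu's alternative, dropping `-DarchetypeCatalog` and configuring the snapshot repository in `settings.xml`, could look roughly like this. The profile and repository ids are assumptions; the URL is the one from the failing command:

```xml
<settings>
  <profiles>
    <profile>
      <id>apache-snapshots</id>
      <repositories>
        <repository>
          <!-- resolves snapshot archetypes instead of -DarchetypeCatalog -->
          <id>archetype</id>
          <url>https://repository.apache.org/content/repositories/snapshots/</url>
          <snapshots><enabled>true</enabled></snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>apache-snapshots</activeProfile>
  </activeProfiles>
</settings>
```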

Best Regards,
Yu


On Mon, 18 Mar 2019 at 16:30, Yu Li  wrote:

> Issues observed when checking quick start:
>
> 1. The versions on the document are still "1.9-SNAPSHOT" instead of
> "1.8.0"
>
> 2. The "Use Maven archetypes" command failed with below error:
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
> (default-cli) on project standalone-pom: archetypeCatalog '
> https://repository.apache.org/content/repositories/snapshots/' is not
> supported anymore. Please read the plugin documentation for details.
>
> Best Regards,
> Yu
>
>
> On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler  wrote:
>
>> We release SNAPSHOT artifacts for all modules, see
>>
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
>> .
>>
>> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>>
>> Finally, we are already adding flink-shaded-hadoop to the optional
>> components section in this PR:
>> https://github.com/apache/flink-web/pull/180
>>
>> On 18.03.2019 08:55, jincheng sun wrote:
>> > -1
>> >
>> > Currently, we have released the Hadoop-related JARs as a snapshot
>> > version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
>> > <
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
>> >),
>> > I think we should release a stable version.
>> > When testing the release code on YARN, currently user cannot find out
>> the
>> > Hadoop dependency.  Although there is a download explanation for Hadoop
>> in
>> > PR [`Update Downloads page for Flink 1.8
>> > `], a 404 error
>> occurs
>> > when you click Download ( I had left detail comments in the PR).
>> >
>> > So, I suggest as follows:
>> >
>> >1. It would be better to add the changes for
>> > `downloads.html#optional-components`, add the Hadoop relation JARs
>> download
>> > link first.
>> >2. Then add instructions on how to get the dependencies of the
>> Hadoop or
>> > add the correct download link directly in the next VOTE mail, due to we
>> do
>> > not include Hadoop in `flink-dist`.
>> >    3. Release a stable version of the Hadoop-related JARs.
>> >
>> > Then, contributors can test it more easily on YARN.  What do you think?
>> >
>> > Best,
>> > Jincheng
>> >
>> >
>> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
>> >
>> >> -1
>> >>
>> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
>> >> binary distribution).
>> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>> >>
>> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>> >>> Hi everyone,
>> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
>> >> follows:
>> >>> [ ] +1, Approve the release
>> >>> [ ] -1, Do not approve the release (please provide specific comments)
>> >>>
>> >>>
>> >>> The complete staging area is available for your review, which
>> includes:
>> >>> * JIRA release notes [1],
>> >>> * the official Apache source release and binary convenience releases
>> to
>> >> be deployed to dist.apache.org  [2], which
>> are
>> >> signed with the key with fingerprint
>> >> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>> >>> * all artifacts to be deployed to the Maven Central Repository [4],
>> >>> * source code tag "release-1.8.0-rc2" [5],
>> >>> * website pull request listing the new release [6]
>> >>> * website pull request adding announcement blog post [7].
>> >>>
>> >>> The vote will be open for at least 72 hours. It is adopted by majority
>> >> approval, with at least 3 PMC affirmative votes.
>> >>> Thanks,
>> >>> Aljoscha
>> >>>
>> >>> [1]
>> >>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>> >> <
>> >>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
>> >> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
>> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
>> >> https://dist.apache.org/repos/dist/release/flink/KEYS>
>> >>> [4]
>> >> https://repository.apache.org/content/repositories/orgapacheflink-1213
>> <
>> >>
>> https://repository.apache.org/content/repositories/orgapacheflink-1210/>
>> >>> [5]
>> >>
>> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
Issues observed when checking quick start:

1. The versions on the document are still "1.9-SNAPSHOT" instead of
"1.8.0"

2. The "Use Maven archetypes" command failed with below error:
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
(default-cli) on project standalone-pom: archetypeCatalog '
https://repository.apache.org/content/repositories/snapshots/' is not
supported anymore. Please read the plugin documentation for details.

Best Regards,
Yu


On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler  wrote:

> We release SNAPSHOT artifacts for all modules, see
>
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
> .
>
> The artifacts to be released do not have a SNAPSHOT suffix:
>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>
> Finally, we are already adding flink-shaded-hadoop to the optional
> components section in this PR:
> https://github.com/apache/flink-web/pull/180
>
> On 18.03.2019 08:55, jincheng sun wrote:
> > -1
> >
> > Currently, we have released the Hadoop-related JARs as a snapshot
> > version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
> > <
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
> >),
> > I think we should release a stable version.
> > When testing the release code on YARN, currently user cannot find out the
> > Hadoop dependency.  Although there is a download explanation for Hadoop
> in
> > PR [`Update Downloads page for Flink 1.8
> > `], a 404 error
> occurs
> > when you click Download ( I had left detail comments in the PR).
> >
> > So, I suggest as follows:
> >
> >1. It would be better to add the changes for
> > `downloads.html#optional-components`, add the Hadoop relation JARs
> download
> > link first.
> >2. Then add instructions on how to get the dependencies of the Hadoop
> or
> > add the correct download link directly in the next VOTE mail, due to we
> do
> > not include Hadoop in `flink-dist`.
> >    3. Release a stable version of the Hadoop-related JARs.
> >
> > Then, contributors can test it more easily on YARN.  What do you think?
> >
> > Best,
> > Jincheng
> >
> >
> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
> >
> >> -1
> >>
> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
> >> binary distribution).
> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
> >>
> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> >> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >> be deployed to dist.apache.org  [2], which are
> >> signed with the key with fingerprint
> >> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc2" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >> approval, with at least 3 PMC affirmative votes.
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1]
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >> <
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> >> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> >> https://dist.apache.org/repos/dist/release/flink/KEYS>
> >>> [4]
> >> https://repository.apache.org/content/repositories/orgapacheflink-1213
> <
> >> https://repository.apache.org/content/repositories/orgapacheflink-1210/
> >
> >>> [5]
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> >> <
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> >>> [6] https://github.com/apache/flink-web/pull/180 <
> >> https://github.com/apache/flink-web/pull/180>
> >>> [7] https://github.com/apache/flink-web/pull/179 <
> >> https://github.com/apache/flink-web/pull/179>
> >>> P.S. The difference to the previous RC1 is very small, you can fetch
> the
> >> two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see
> the
> >> difference in commits. It's fixes for the issues that led to the
> >> cancellation of the previous RC plus smaller fixes. Most
> >> verification/testing that was carried out should apply as is to this RC.
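The `git log` tag-range check the P.S. describes can be illustrated locally with a throwaway repository (the repo, tag names, and commit messages below are stand-ins; against a real Flink checkout you would fetch and use the release-1.8.0-rc1 and release-1.8.0-rc2 tags):

```shell
#!/bin/sh
set -e
# Build a tiny repo with two tags standing in for the two RC tags.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=rc@example.com -c user.name=rc \
    commit -q --allow-empty -m "base commit shared by both RCs"
git tag release-x-rc1
git -c user.email=rc@example.com -c user.name=rc \
    commit -q --allow-empty -m "fix NOTICE file"
git tag release-x-rc2
# The double-dot range lists only the commits rc2 has on top of rc1:
log=$(git log --oneline release-x-rc1..release-x-rc2)
echo "$log"
```

Only the "fix NOTICE file" commit is printed; the shared base commit is excluded, which is exactly how one inspects what changed between two release candidates.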

[jira] [Created] (FLINK-11950) Add missing dependencies in NOTICE file of flink-dist.

2019-03-18 Thread sunjincheng (JIRA)
sunjincheng created FLINK-11950:
---

 Summary: Add missing dependencies in NOTICE file of flink-dist.
 Key: FLINK-11950
 URL: https://issues.apache.org/jira/browse/FLINK-11950
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.8.0, 1.9.0
Reporter: sunjincheng
Assignee: sunjincheng


Add missing dependencies in the NOTICE file of flink-dist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
We release SNAPSHOT artifacts for all modules; see 
https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/ 
.


The artifacts to be released do not have a SNAPSHOT suffix: 
https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/


Finally, we are already adding flink-shaded-hadoop to the optional 
components section in this PR: https://github.com/apache/flink-web/pull/180


On 18.03.2019 08:55, jincheng sun wrote:

-1

Currently, we have released the Hadoop-related JARs as snapshot
versions (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT);
I think we should release a stable version.
When testing the release code on YARN, users currently cannot find the
Hadoop dependencies. Although there is a download explanation for Hadoop in
the PR [`Update Downloads page for Flink 1.8`], a 404 error occurs
when you click Download (I left detailed comments in the PR).

So, I suggest as follows:

   1. It would be better to add the changes for
`downloads.html#optional-components`, adding the Hadoop-related JAR download
links first.
   2. Then add instructions on how to obtain the Hadoop dependencies, or
add the correct download link directly in the next VOTE mail, since we do
not include Hadoop in `flink-dist`.
   3. Release stable versions of the Hadoop-related JARs.

Then, contributors can test it more easily on YARN.  What do you think?

Best,
Jincheng
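The version strings at issue follow the `<hadoop-version>-<flink-version>` scheme (e.g. 2.4.1-1.8.0 for a stable artifact), while unreleased development builds carry a `-SNAPSHOT` suffix. A minimal sketch of the distinction being drawn here (the helper name is my own, not anything in Flink's tooling):

```shell
#!/bin/sh
# Hypothetical helper: classify a flink-shaded-hadoop2-uber version string.
# A release candidate must stage stable <hadoop>-<flink> versions; anything
# ending in -SNAPSHOT is an unreleased development build.
classify_version() {
    case "$1" in
        *-SNAPSHOT) echo "snapshot";;
        *)          echo "stable";;
    esac
}
bad=$(classify_version "2.4.1-1.8-SNAPSHOT")
good=$(classify_version "2.4.1-1.8.0")
echo "$bad $good"
```

Under this reading, the -1 above amounts to: the staged flink-shaded-hadoop2-uber artifacts must all classify as "stable" before contributors can reliably test the RC on YARN.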


