[jira] [Created] (FLINK-22022) Reduce the ExecNode scan scope to improve performance when converting json plan to ExecNodeGraph

2021-03-29 Thread godfrey he (Jira)
godfrey he created FLINK-22022:
--

 Summary: Reduce the ExecNode scan scope to improve performance 
when converting json plan to ExecNodeGraph
 Key: FLINK-22022
 URL: https://issues.apache.org/jira/browse/FLINK-22022
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.13.0
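
(No description was provided; presumably - an assumption based on the summary and the related FLINK-22020 - the idea is to narrow the Reflections classpath scan performed during json plan deserialization to the package that actually contains the ExecNode implementations. A rough, purely illustrative sketch:)

{code:java}
import java.util.Set;

import org.apache.flink.table.planner.plan.nodes.exec.ExecNode;
import org.reflections.Reflections;

public final class ExecNodeScanSketch {

    // Assumption, illustrative only: scanning just the ExecNode package is much
    // cheaper than scanning a broad classpath prefix on every plan load.
    @SuppressWarnings("rawtypes")
    static Set<Class<? extends ExecNode>> scanExecNodes() {
        Reflections reflections =
                new Reflections("org.apache.flink.table.planner.plan.nodes.exec");
        return reflections.getSubTypesOf(ExecNode.class);
    }
}
{code}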






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22021) PushFilterIntoLegacyTableSourceScanRule fails to deal with INTERVAL types

2021-03-29 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-22021:
---

 Summary: PushFilterIntoLegacyTableSourceScanRule fails to deal 
with INTERVAL types
 Key: FLINK-22021
 URL: https://issues.apache.org/jira/browse/FLINK-22021
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.13.0
Reporter: Caizhi Weng
 Fix For: 1.13.0


Add the following test case to {{PushFilterIntoLegacyTableSourceScanRuleTest}} 
to reproduce this bug:

{code:scala}
@Test
def testWithInterval(): Unit = {
  val schema = TableSchema
    .builder()
    .field("a", DataTypes.STRING)
    .field("b", DataTypes.STRING)
    .build()

  val data = List(Row.of("2021-03-30 10:00:00", "2021-03-30 11:00:00"))
  TestLegacyFilterableTableSource.createTemporaryTable(
    util.tableEnv,
    schema,
    "MTable",
    isBounded = true,
    data,
    List("a", "b"))

  util.verifyRelPlan(
    """
      |SELECT * FROM MTable
      |WHERE
      |  TIMESTAMPADD(HOUR, 1, TO_TIMESTAMP(a)) >= TO_TIMESTAMP(b)
      |  OR
      |  TIMESTAMPADD(YEAR, 1, TO_TIMESTAMP(b)) >= TO_TIMESTAMP(a)
      |""".stripMargin)
}
{code}
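
The mismatch behind the exception below appears to be that Calcite represents a day-time interval literal as a {{java.math.BigDecimal}} holding a number of milliseconds, while Flink's {{ValueLiteralExpression}} for INTERVAL SECOND only accepts a {{java.time.Duration}}. A minimal sketch (illustrative only, not the actual fix) of the conversion the literal converter would need:

{code:java}
import java.math.BigDecimal;
import java.time.Duration;

public final class IntervalLiteralSketch {

    // Illustrative only: Calcite stores day-time interval literals as a
    // BigDecimal number of milliseconds; Flink's INTERVAL SECOND literal
    // expects a java.time.Duration.
    static Duration toFlinkValue(BigDecimal calciteMillis) {
        return Duration.ofMillis(calciteMillis.longValueExact());
    }

    public static void main(String[] args) {
        System.out.println(toFlinkValue(BigDecimal.valueOf(3_600_000))); // PT1H
    }
}
{code}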

The exception stack is
{code}
org.apache.flink.table.api.ValidationException: Data type 'INTERVAL SECOND(3) NOT NULL' with conversion class 'java.time.Duration' does not support a value literal of class 'java.math.BigDecimal'.

	at org.apache.flink.table.expressions.ValueLiteralExpression.validateValueDataType(ValueLiteralExpression.java:286)
	at org.apache.flink.table.expressions.ValueLiteralExpression.<init>(ValueLiteralExpression.java:79)
	at org.apache.flink.table.expressions.ApiExpressionUtils.valueLiteral(ApiExpressionUtils.java:251)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitLiteral(RexNodeExtractor.scala:451)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitLiteral(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexLiteral.accept(RexLiteral.java:1173)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:458)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:174)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:458)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:174)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at
{code}

[jira] [Created] (FLINK-22020) Reflections ClassNotFound when trying to deserialize a json plan in deployment environment

2021-03-29 Thread Wenlong Lyu (Jira)
Wenlong Lyu created FLINK-22020:
---

 Summary: Reflections ClassNotFound when trying to deserialize a 
json plan in deployment environment
 Key: FLINK-22020
 URL: https://issues.apache.org/jira/browse/FLINK-22020
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.13.0
Reporter: Wenlong Lyu






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Stateful Functions 3.0.0

2021-03-29 Thread Robert Metzger
Wow, the feature list sounds really exciting!

No concerns from my side!

On Thu, Mar 25, 2021 at 1:57 PM Konstantin Knauf  wrote:

> Hi Gordon,
>
> Thank you for the update. +1 for a timely release. For existing Statefun
> users, is there already something in the documentation that describes the
> breaking changes/migration in more detail in order to prepare?
>
> Cheers,
>
> Konstantin
>
> On Thu, Mar 25, 2021 at 9:27 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi everyone,
> >
> > We'd like to prepare to release StateFun 3.0.0 over the next few days,
> > ideally starting the first release candidate early next week.
> >
> > This is a version bump from 2.x to 3.x, with some major new features and
> > reworks:
> >
> >- New request-reply protocol:
> >the protocol has been reworked as StateFun is moving towards a
> >"remote functions first" design. The new protocol enables StateFun apps
> >to be hot-upgraded without restarting the StateFun runtime, including
> >registering new state for functions and adding new functions to the app.
> >- Cross-language type system:
> >The new protocol also enables a much more ergonomic, cross-language
> type
> >system. This makes it much easier and natural for users to send
> > messages of
> >various types around between their functions (primitive types, or
> custom
> >types such as JSON messages).
> >- Java SDK for remote functions: Going remote first, we've now also
> >added a new Java SDK for remote functions.
> >
> > These are some major features that users would benefit from greatly.
> > Since this release also contains breaking changes, it would be nice to
> > get this out early so that new users are not onboarded with APIs that
> > are going to be immediately deprecated.
> >
> > We're in the final stages of preparing documentation and examples [1] to
> go
> > with the release, and would like to kick off the release candidates early
> > next week.
> >
> > Please let us know if you have any concerns.
> >
> > Thanks,
> > Gordon
> >
> > [1] https://github.com/apache/flink-statefun-playground
> >
>
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk
>


Re: Glob support on file access

2021-03-29 Thread Arvid Heise
Hi Etienne,

In general, any small PR on this subject is very welcome. I don't think
that the community as a whole will invest much into FileInputFormat, as the
whole DataSet API is being phased out.

Afaik SQL and Table API are only using InputFormat for the legacy
compatibility layer (e.g. when it comes to translating into DataSet). All
the new batchy stuff is based on BulkFormat and unified source/sink
interface. I'm CC'ing Timo who can correct me if I'm wrong.

So if you just want to add glob support on FileInputFormat /only/ for SQL
and Table API, I don't think it's worth the effort. It would be more
interesting to see whether the new FileSource supports it properly and
rather add it there.
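
To make that concrete, this is roughly what reading files with the new
unified FileSource looks like on 1.12 (a sketch only; TextLineFormat just
stands in for a real bulk format, and the glob filtering itself is exactly
the piece that does not exist yet):

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Enumerates the given directory; a glob such as "*.parquet" would
        // have to be handled by the file enumeration -- the missing piece.
        FileSource<String> source = FileSource
                .forRecordStreamFormat(new TextLineFormat(), new Path("/data/input"))
                .build();
        DataStream<String> lines =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source");
        lines.print();
        env.execute("glob-sketch");
    }
}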

On Mon, Mar 29, 2021 at 4:57 PM Etienne Chauchot 
wrote:

> But still, this workaround only works when you have access to the
> underlying /FileInputFormat/. For SQL and Table APIs, you don't, so
> you'll be unable to apply this workaround. So what we could do is make a
> PR to support globs at the FileInputFormat level so that all APIs benefit.
>
> I'm gonna do it if everyone agrees.
>
> Best
>
> Etienne Chauchot
>
> On 25/03/2021 13:12, Etienne Chauchot wrote:
> >
> > Hi all,
> >
> > In case it is useful to some of you:
> >
> > I have a big batch that needs to use globs (*.parquet for example) to
> > read input files. It seems that globs do not work out of the box (see
> > https://issues.apache.org/jira/browse/FLINK-6417)
> >
> > But there is a workaround:
> >
> >
> > final FileInputFormat inputFormat = new FileInputFormat(new Path(extractDir(filePath))); /* or any subclass of FileInputFormat */ /* extract the parent dir */
> > inputFormat.setFilesFilter(new GlobFilePathFilter(Collections.singletonList(filePath), Collections.emptyList())); /* filePath contains the glob; the whole path needs to be provided to GlobFilePathFilter */
> > inputFormat.setNestedFileEnumeration(true);
> >
> > Hope, it helps some people
> >
> > Etienne Chauchot
> >
> >
>


Re: Re: Re: [DISCUSS] FLIP-147: Support Checkpoints After Tasks Finished

2021-03-29 Thread Till Rohrmann
Thanks a lot for all your input. To sum up the discussion so far:

## Final checkpoints

We currently agree on favouring a single final checkpoint which can shut
down the topology. In order to support this we need to be able to create a
checkpoint after an operator has finished producing results.

If we want to send checkpoint barriers through the topology this means that
a task must not close the network connection when it sees "logical end of
data". Instead, on the "logical end of data" the contained operator should
flush all of its records. This means that we need to introduce a new event
"logical end of data" and API calls to signal an operator that it should
flush its data and that it should shut down.

Given the available methods, `endInput` could be used for signalling the
"logical end of data" and `dispose` for shutting the operator down. A task
will only shut down and send an "EndOfPartitionEvent" which closes the TCP
connection if all of its inputs have shut down and if it has completed a
final checkpoint.

## Global commits

Now a somewhat related but also orthogonal issue is how to support a global
commit. A global commit is a commit where the external artefacts of a
checkpoint become visible all at once. The global commit should be
supported for streaming as well as batch executions (it is probably even
more important for batch executions). In general, there could be different
ways of implementing the global commit mechanism:

1. Collect global commit handles on the JM and run the global commit action
on the JM
2. Collect global commit handles in a parallelism 1 operator which performs
the global commit action

Approach 2. would probably require being able to send records (which would be
the global commit handles) from the snapshotState() method. Both
approaches would have to persist some kind of information in the checkpoint
which allows redoing the global commit operation in case of a failover.
Therefore, for approach 1. it would be required that we send the global
commit handles to the JM from the snapshotState() method and not the
notifyCheckpointComplete().

A related question is in which order to execute the local and global commit
actions:

1. Unspecified order
2. First local commits and then global commits

Option 1. would be easier to implement and might already be good enough for
most sinks.

I would suggest treating final checkpoints and global commits as two
related but separate things. I think it is fine to first try to solve the
final checkpoints problem and then to tackle the global commits. This will
help to decrease the scope of each individual feature.
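
To make approach 2. a bit more concrete, here is a minimal sketch (all names
are hypothetical, this is not an existing Flink API) of a parallelism-1
operator that collects the global commit handles emitted from
snapshotState() and performs the global commit once the checkpoint is
complete:

import java.util.ArrayList;
import java.util.List;

public final class GlobalCommitterSketch {

    /** Opaque handle describing what a checkpoint produced externally. */
    public interface GlobalCommitHandle {}

    private final List<GlobalCommitHandle> pending = new ArrayList<>();

    /** Called for every handle record the upstream writers send along. */
    public void processElement(GlobalCommitHandle handle) {
        pending.add(handle);
    }

    /**
     * The handles must be persisted in the checkpoint so that the global
     * commit can be redone after a failover -- hence they are collected in
     * snapshotState(), not in notifyCheckpointComplete().
     */
    public List<GlobalCommitHandle> snapshotState() {
        return new ArrayList<>(pending);
    }

    /** Makes all external artefacts of the checkpoint visible at once. */
    public void notifyCheckpointComplete(long checkpointId) {
        // E.g. atomically publish all files/partitions referenced by the
        // handles. Ordering w.r.t. local commits is unspecified (option 1.).
        pending.clear();
    }
}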

Cheers,
Till

On Fri, Mar 5, 2021 at 5:12 AM Yun Gao  wrote:

> Hi Piotr,
>
> Very thanks for the suggestions and thoughts!
>
> > Is this a problem? Pipeline would be empty, so EndOfPartitionEvent would
> be traveling very quickly.
>
> No, this is not a problem; sorry, I had some wrong thoughts here.
> Initially I was in fact thinking of the issue raised by
> @kezhu:
>
> > Besides this, will FLIP-147 eventually need some ways to decide whether
> an operator need final checkpoint
> @Yun @Guowei ?  @Arvid mentions this in earlier mail.
>
> For this issue itself, I still lean towards needing it; for
> example, suppose we have a job that
> does not need to commit anything on finishing, then it does not need to wait
> for a checkpoint at all in the normal
> finish case.
>
> > Yes, but aren't we doing it right now anyway?
> `StreamSource#advanceToEndOfEventTime`?
>
> Yes, we indeed have advancedEndOfEventTime for both legacy and new
> sources, sorry for the overlook.
>
> > Is this the plan? That upon recovery we are restarting all operators,
> even those that have already finished?
> Certainly it's one of the possibilities.
>
> For the first version we would tend to use this way since it is easier to
> implement, and we always need
> to consider the case that tasks are started but operators are finished,
> since there might also be tasks with part
> of their operators finished. In the long run I think we could continue to
> optimize it by not restarting the finished tasks
> at all.
>
> > Keep in mind that those are two separate things, as I mentioned in a
> previous e-mail:
> > > II. When should the `GlobalCommitHandle` be created? Should it be
> returned from `snapshotState()`, `notifyCheckpointComplete()` or somewhere
> else?
> > > III. What should be the ordering guarantee between global commit and
> local commit, if any? Actually the easiest to implement would be undefined,
> but de facto global commit happening before local commits (first invoke > 
> `notifyCheckpointComplete()`
> on the `OperatorCoordinator` and either after or in parallel send
> `notifyCheckpointComplete()` RPCs). As far as I can tell, undefined order
> should work for the use cases that I'm aware of.
> >
> > We could create the `GlobalCommitHandle` in
> `StreamOperator#snapshotState()`, while we could also ensure that
> `notifyCheckpointComplete()` is 

[jira] [Created] (FLINK-22019) UnalignedCheckpointRescaleITCase hangs on azure

2021-03-29 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-22019:


 Summary: UnalignedCheckpointRescaleITCase hangs on azure
 Key: FLINK-22019
 URL: https://issues.apache.org/jira/browse/FLINK-22019
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.13.0
Reporter: Dawid Wysakowicz


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15658=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=5360d54c-8d94-5d85-304e-a89267eb785a=9347



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Glob support on file access

2021-03-29 Thread Etienne Chauchot
But still, this workaround only works when you have access to the
underlying /FileInputFormat/. For SQL and Table APIs, you don't, so
you'll be unable to apply this workaround. So what we could do is make a
PR to support globs at the FileInputFormat level so that all APIs benefit.


I'm gonna do it if everyone agrees.

Best

Etienne Chauchot

On 25/03/2021 13:12, Etienne Chauchot wrote:


Hi all,

In case it is useful to some of you:

I have a big batch that needs to use globs (*.parquet for example) to 
read input files. It seems that globs do not work out of the box (see 
https://issues.apache.org/jira/browse/FLINK-6417)


But there is a workaround:


final FileInputFormat inputFormat = new FileInputFormat(new Path(extractDir(filePath))); /* or any subclass of FileInputFormat */ /* extract the parent dir */
inputFormat.setFilesFilter(new GlobFilePathFilter(
    Collections.singletonList(filePath), Collections.emptyList()));
/* filePath contains the glob; the whole path needs to be provided to GlobFilePathFilter */
inputFormat.setNestedFileEnumeration(true);

Hope, it helps some people

Etienne Chauchot




[jira] [Created] (FLINK-22018) JdbcBatchingOutputFormat should not log SQLException on retry

2021-03-29 Thread Nicolas Deslandes (Jira)
Nicolas Deslandes created FLINK-22018:
-

 Summary: JdbcBatchingOutputFormat should not log SQLException on 
retry
 Key: FLINK-22018
 URL: https://issues.apache.org/jira/browse/FLINK-22018
 Project: Flink
  Issue Type: Bug
  Components: Connectors / RabbitMQ
Affects Versions: 1.7.2, 1.8.0, 1.8.2
Reporter: Nicolas Deslandes
Assignee: Nicolas Deslandes
 Fix For: 1.8.3, 1.9.2, 1.10.0


The RabbitMQ connector does not close consumers and channels on closing.

This potentially leaves an idle consumer on the queue that prevents any other
consumer on the same queue from getting messages; this happens most often when
a job is stopped/cancelled and redeployed.
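
A minimal sketch (hypothetical field names, not the actual connector code) of the cleanup the connector's close() would need so that no idle consumer keeps hanging on the queue:

{code:java}
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;

public final class RmqCleanupSketch implements AutoCloseable {

    // Hypothetical fields standing in for the connector's internals.
    private final Connection connection;
    private final Channel channel;
    private final String consumerTag;

    RmqCleanupSketch(Connection connection, Channel channel, String consumerTag) {
        this.connection = connection;
        this.channel = channel;
        this.consumerTag = consumerTag;
    }

    @Override
    public void close() throws Exception {
        // Cancel the consumer first so the broker stops delivering to it, then
        // close channel and connection; otherwise an idle consumer can keep the
        // queue's messages assigned to a dead client after cancel/redeploy.
        if (consumerTag != null) {
            channel.basicCancel(consumerTag);
        }
        channel.close();
        connection.close();
    }
}
{code}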



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22017) Regions may never be scheduled when there are cross-region blocking edges

2021-03-29 Thread Zhilong Hong (Jira)
Zhilong Hong created FLINK-22017:


 Summary: Regions may never be scheduled when there are 
cross-region blocking edges
 Key: FLINK-22017
 URL: https://issues.apache.org/jira/browse/FLINK-22017
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Zhilong Hong
 Attachments: Illustration.jpg

For a topology with cross-region blocking edges, there are regions that may 
never be scheduled. The case is illustrated in the figure below.

!Illustration.jpg!

Let's denote the vertices as v{layer}_{number}. It's clear that the edge connecting
v2_2 and v3_2 crosses region 1 and region 2. Since region 1 has no blocking
edges connected to other regions, it will be scheduled first. When v2_2 is
finished, PipelinedRegionSchedulingStrategy will trigger
{{onExecutionStateChange}} for it.

As expected, region 2 should then be scheduled since all the partitions it
consumes are consumable. But in fact region 2 won't be scheduled, because the
result partition of v2_2 is not tagged as consumable. Whether it is consumable
or not is determined by its IntermediateDataSet.

However, an IntermediateDataSet is consumable if and only if all the producers
of its IntermediateResultPartitions are finished. This IntermediateDataSet will
never be consumable since v2_3 is not scheduled. All in all, this forms a
deadlock: a region will never be scheduled because it has not been scheduled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22016) PushFilterIntoLegacyTableSourceScanRule fails to deal with NULLs

2021-03-29 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-22016:
---

 Summary: PushFilterIntoLegacyTableSourceScanRule fails to deal 
with NULLs
 Key: FLINK-22016
 URL: https://issues.apache.org/jira/browse/FLINK-22016
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.13.0
Reporter: Caizhi Weng
 Fix For: 1.13.0


Add the following test case to {{PushFilterIntoLegacyTableSourceScanRuleTest}} 
to reproduce this bug:

{code:scala}
@Test
def myTest(): Unit = {
  val schema = TableSchema
    .builder()
    .field("a", DataTypes.STRING)
    .field("b", DataTypes.STRING)
    .build()

  val data = List(Row.of("foo", "bar"))
  TestLegacyFilterableTableSource.createTemporaryTable(
    util.tableEnv,
    schema,
    "MTable",
    isBounded = true,
    data,
    List("a", "b"))

  util.verifyRelPlan(
    """
      |WITH MView AS (SELECT CASE
      |  WHEN a = b THEN a
      |  ELSE CAST(NULL AS STRING)
      |  END AS a
      |  FROM MTable)
      |SELECT a FROM MView WHERE a IS NOT NULL
      |""".stripMargin)
}
{code}
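
The exception below suggests the converter keeps the original NOT NULL data type even when the literal value is null. A minimal sketch (an assumption about the fix direction, not verified against the actual fix) of the adjustment the literal conversion would need:

{code:java}
import static org.apache.flink.table.api.DataTypes.STRING;

import org.apache.flink.table.types.DataType;

public final class NullLiteralSketch {

    // Assumption, for illustration only: when the literal value is null, the
    // data type handed to ValueLiteralExpression must be nullable, otherwise
    // validateValueDataType throws the exception below.
    static DataType typeForLiteral(Object value, DataType dataType) {
        return value == null ? dataType.nullable() : dataType;
    }

    public static void main(String[] args) {
        System.out.println(typeForLiteral(null, STRING().notNull())); // STRING
    }
}
{code}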

The exception stack is
{code}
org.apache.flink.table.api.ValidationException: Data type 'STRING NOT NULL' does not support null values.

	at org.apache.flink.table.expressions.ValueLiteralExpression.validateValueDataType(ValueLiteralExpression.java:272)
	at org.apache.flink.table.expressions.ValueLiteralExpression.<init>(ValueLiteralExpression.java:79)
	at org.apache.flink.table.expressions.ApiExpressionUtils.valueLiteral(ApiExpressionUtils.java:251)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitLiteral(RexNodeExtractor.scala:451)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitLiteral(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexLiteral.accept(RexLiteral.java:1173)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:458)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:174)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter$$anonfun$8.apply(RexNodeExtractor.scala:459)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:458)
	at org.apache.flink.table.planner.plan.utils.RexNodeToExpressionConverter.visitCall(RexNodeExtractor.scala:359)
	at org.apache.calcite.rex.RexCall.accept(RexCall.java:174)
	at org.apache.flink.table.planner.plan.utils.RexNodeExtractor$$anonfun$extractConjunctiveConditions$1.apply(RexNodeExtractor.scala:136)
	at org.apache.flink.table.planner.plan.utils.RexNodeExtractor$$anonfun$extractConjunctiveConditions$1.apply(RexNodeExtractor.scala:135)
	at scala.collection.Iterator$class.foreach(Iterator.scala:891)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
	at
{code}

[jira] [Created] (FLINK-22015) SQL filter containing OR and IS NULL will produce an incorrect result.

2021-03-29 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-22015:
---

 Summary: SQL filter containing OR and IS NULL will produce an 
incorrect result.
 Key: FLINK-22015
 URL: https://issues.apache.org/jira/browse/FLINK-22015
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.13.0
Reporter: Caizhi Weng
 Fix For: 1.13.0


Add the following test case to {{CalcITCase}} to reproduce this bug.

{code:scala}
@Test
def myTest(): Unit = {
  checkResult(
"""
  |WITH myView AS (SELECT a, CASE
  |  WHEN a = 1 THEN '1'
  |  ELSE CAST(NULL AS STRING)
  |  END AS s
  |FROM SmallTable3)
  |SELECT a FROM myView WHERE s = '2' OR s IS NULL
  |""".stripMargin,
Seq(row(2), row(3)))
}
{code}

However, if we remove the {{s = '2'}} condition, the result will be correct.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22014) Flink JobManager failed to restart after failure in kubernetes HA setup

2021-03-29 Thread Mikalai Lushchytski (Jira)
Mikalai Lushchytski created FLINK-22014:
---

 Summary: Flink JobManager failed to restart after failure in 
kubernetes HA setup
 Key: FLINK-22014
 URL: https://issues.apache.org/jira/browse/FLINK-22014
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes
Affects Versions: 1.12.2
Reporter: Mikalai Lushchytski


After the JobManager pod failed and a new one started, it was not able to
recover jobs due to the absence of recovery data in storage - the config map
pointed at a non-existing file.

Due to this the JobManager pod entered the `CrashLoopBackOff` state and was
not able to recover - each attempt failed with the same error, so the whole
cluster became unrecoverable and non-operational.

I had to manually delete the config map and start the jobs again without the
savepoint.

When I tried to emulate the failure further by deleting the JobManager pod
manually, the new pod recovered well every time and the issue was not
reproducible artificially anymore.

Below is the failure log:

{code}
2021-03-26 08:22:57,925 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl [] - Starting the SlotManager.
2021-03-26 08:22:57,928 INFO  org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - Starting DefaultLeaderRetrievalService with KubernetesLeaderRetrievalDriver{configMapName='stellar-flink-cluster-dispatcher-leader'}.
2021-03-26 08:22:57,931 INFO  org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [198c46bac791e73ebcc565a550fa4ff6, 344f5ebc1b5c3a566b4b2837813e4940, 96c4603a0822d10884f7fe536703d811, d9ded24224aab7c7041420b3efc1b6ba] from KubernetesStateHandleStore{configMapName='stellar-flink-cluster-dispatcher-leader'}
2021-03-26 08:22:57,933 INFO  org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Trying to recover job with job id 198c46bac791e73ebcc565a550fa4ff6.
2021-03-26 08:22:58,029 INFO  org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Stopping SessionDispatcherLeaderProcess.
2021-03-26 08:28:22,677 INFO  org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Stopping DefaultJobGraphStore.
2021-03-26 08:28:22,681 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not recover job with job id 198c46bac791e73ebcc565a550fa4ff6.
	at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) [?:?]
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
	at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not recover job with job id 198c46bac791e73ebcc565a550fa4ff6.
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJob(SessionDispatcherLeaderProcess.java:144) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobs(SessionDispatcherLeaderProcess.java:122) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobsIfRunning(SessionDispatcherLeaderProcess.java:113) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	... 4 more
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under jobGraph-198c46bac791e73ebcc565a550fa4ff6. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
	at org.apache.flink.runtime.jobmanager.DefaultJobGraphStore.recoverJobGraph(DefaultJobGraphStore.java:171) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJob(SessionDispatcherLeaderProcess.java:141) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobs(SessionDispatcherLeaderProcess.java:122) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
	at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobsIfRunning(SessionDispatcherLeaderProcess.java:113) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
{code}

[GitHub] [flink-ml] becketqin commented on pull request #1: [Flink-21976] Move Flink ML pipeline API and library code from apache/flink to apache/flink-ml

2021-03-29 Thread GitBox


becketqin commented on pull request #1:
URL: https://github.com/apache/flink-ml/pull/1#issuecomment-809247359


   @lindong28 Thanks for the patch. Merged to master.
   08d058046f34b711128e0646ffbdc7e384c22064
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: [DISCUSS] Feature freeze date for 1.13

2021-03-29 Thread Till Rohrmann
+1 for the 31st of March for the feature freeze.

Cheers,
Till

On Mon, Mar 29, 2021 at 10:12 AM Robert Metzger  wrote:

> +1 for March 31st for the feature freeze.
>
>
>
> On Fri, Mar 26, 2021 at 3:39 PM Dawid Wysakowicz 
> wrote:
>
> > Thank you Thomas! I'll definitely check the issue you linked.
> >
> > Best,
> >
> > Dawid
> >
> > On 23/03/2021 20:35, Thomas Weise wrote:
> > > Hi Dawid,
> > >
> > > Thanks for the heads up.
> > >
> > > Regarding the "Rebase and merge" button. I find that merge option
> useful,
> > > especially for small simple changes and for backports. The following
> > should
> > > help to safeguard from the issue encountered previously:
> > > https://github.com/jazzband/pip-tools/issues/1085
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > > On Tue, Mar 23, 2021 at 4:58 AM Dawid Wysakowicz <
> dwysakow...@apache.org
> > >
> > > wrote:
> > >
> > >> Hi devs, users!
> > >>
> > >> 1. *Feature freeze date*
> > >>
> > >> We are approaching the end of March which we agreed would be the time
> > for
> > >> a Feature Freeze. From the knowledge I've gathered so far it still seems
> > to
> > >> be a viable plan. I think it is a good time to agree on a particular
> > date,
> > >> when it should happen. We suggest *(end of day CEST) March 31st*
> > >> (Wednesday next week) as the feature freeze time.
> > >>
> > >> Similarly as last time, we want to create RC0 on the day after the
> > feature
> > >> freeze, to make sure the RC creation process is running smoothly, and
> to
> > >> have a common testing reference point.
> > >>
> > >> Having said that, let us repeat after Robert & Dian from the previous
> > >> release what a Feature Freeze means:
> > >>
> > >> *B) What does feature freeze mean?* After the feature freeze, no new
> > >> features are allowed to be merged to master. Only bug fixes and
> > >> documentation improvements.
> > >> The release managers will revert new feature commits after the feature
> > >> freeze.
> > >> Rationale: The goal of the feature freeze phase is to improve the
> system
> > >> stability by addressing known bugs. New features tend to introduce new
> > >> instabilities, which would prolong the release process.
> > >> If you need to merge a new feature after the freeze, please open a
> > >> discussion on the dev@ list. If there are no objections by a PMC
> member
> > >> within 48 (workday)hours, the feature can be merged.
> > >>
> > >> 2. *Merge PRs from the command line*
> > >>
> > >> In the past releases it was quite frequent around the Feature Freeze
> > date
> > >> that we ended up with a broken main branch that either did not compile
> > or
> > >> there were failing tests. It was often due to concurrent merges to the
> > main
> > >> branch via the "Rebase and merge" button. To overcome the problem we
> > would
> > >> like to suggest only ever merging PRs from a command line. Thank you
> > >> Stephan for the idea! The suggested workflow would look as follows:
> > >>
> > >>1. Pull the change and rebase on the current main branch
> > >>2. Build the project (e.g. from IDE, which should be faster than
> > >>building entire project from cmd) -> this should ensure the project
> > compiles
> > >>3. Run the tests in the module that the change affects -> this
> should
> > >>greatly minimize the chances of failing tests
> > >>4. Push the change to the main branch
> > >>
> > >> Let us know what you think!
> > >>
> > >> Best,
> > >>
> > >> Guowei & Dawid
> > >>
> > >>
> > >>
> >
> >
>


[jira] [Created] (FLINK-22013) Add CI to run tests in Azure DevOps pipeline for every flink-ml pull request

2021-03-29 Thread Dong Lin (Jira)
Dong Lin created FLINK-22013:


 Summary: Add CI to run tests in Azure DevOps pipeline for every 
flink-ml pull request
 Key: FLINK-22013
 URL: https://issues.apache.org/jira/browse/FLINK-22013
 Project: Flink
  Issue Type: Improvement
Reporter: Dong Lin


We should add CI to run tests in an Azure DevOps pipeline for every flink-ml pull 
request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22012) Add nightly-build pipeline to test flink-ml repo with the latest flink repo snapshot

2021-03-29 Thread Dong Lin (Jira)
Dong Lin created FLINK-22012:


 Summary: Add nightly-build pipeline to test flink-ml repo with the 
latest flink repo snapshot
 Key: FLINK-22012
 URL: https://issues.apache.org/jira/browse/FLINK-22012
 Project: Flink
  Issue Type: Improvement
Reporter: Dong Lin


We should add a nightly-build pipeline to test the flink-ml repo with the latest 
Flink repo snapshot so that, if there is any issue in the Flink repo that 
breaks the flink-ml repo, we can catch the issue within 24 hours.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22011) Support local global optimization for window aggregation in runtime

2021-03-29 Thread Jark Wu (Jira)
Jark Wu created FLINK-22011:
---

 Summary: Support local global optimization for window aggregation 
in runtime
 Key: FLINK-22011
 URL: https://issues.apache.org/jira/browse/FLINK-22011
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Runtime
Reporter: Jark Wu
Assignee: Jark Wu






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[RESULT] [VOTE] Apache Flink Jira Process (& Bot)

2021-03-29 Thread Konstantin Knauf
Hi everyone,

I am happy to announce that the proposal has been accepted. We received
five +1 votes, four of which are binding, and no vetoes:

Till (binding)
Roman (binding)
Arvid (binding)
Robert (binding)
Matthias (non-binding)

I will share an implementation proposal soon.

Thanks,

Konstantin

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


Re: [DISCUSS] Feature freeze date for 1.13

2021-03-29 Thread Robert Metzger
+1 for March 31st for the feature freeze.



On Fri, Mar 26, 2021 at 3:39 PM Dawid Wysakowicz 
wrote:

> Thank you Thomas! I'll definitely check the issue you linked.
>
> Best,
>
> Dawid
>
> On 23/03/2021 20:35, Thomas Weise wrote:
> > Hi Dawid,
> >
> > Thanks for the heads up.
> >
> > Regarding the "Rebase and merge" button. I find that merge option useful,
> > especially for small simple changes and for backports. The following
> should
> > help to safeguard from the issue encountered previously:
> > https://github.com/jazzband/pip-tools/issues/1085
> >
> > Thanks,
> > Thomas
> >
> >
> > On Tue, Mar 23, 2021 at 4:58 AM Dawid Wysakowicz  >
> > wrote:
> >
> >> Hi devs, users!
> >>
> >> 1. *Feature freeze date*
> >>
> >> We are approaching the end of March which we agreed would be the time
> for
> >> a Feature Freeze. From the knowledge I've gathered so far it still seems
> to
> >> be a viable plan. I think it is a good time to agree on a particular
> date,
> >> when it should happen. We suggest *(end of day CEST) March 31st*
> >> (Wednesday next week) as the feature freeze time.
> >>
> >> Similarly as last time, we want to create RC0 on the day after the
> feature
> >> freeze, to make sure the RC creation process is running smoothly, and to
> >> have a common testing reference point.
> >>
> >> Having said that, let us repeat after Robert & Dian from the previous
> >> release what a Feature Freeze means:
> >>
> >> *B) What does feature freeze mean?* After the feature freeze, no new
> >> features are allowed to be merged to master. Only bug fixes and
> >> documentation improvements.
> >> The release managers will revert new feature commits after the feature
> >> freeze.
> >> Rationale: The goal of the feature freeze phase is to improve the system
> >> stability by addressing known bugs. New features tend to introduce new
> >> instabilities, which would prolong the release process.
> >> If you need to merge a new feature after the freeze, please open a
> >> discussion on the dev@ list. If there are no objections by a PMC member
> >> within 48 (workday)hours, the feature can be merged.
> >>
> >> 2. *Merge PRs from the command line*
> >>
> >> In the past releases it was quite frequent around the Feature Freeze
> date
> >> that we ended up with a broken main branch that either did not compile
> or
> >> there were failing tests. It was often due to concurrent merges to the
> main
> >> branch via the "Rebase and merge" button. To overcome the problem we
> would
> >> like to suggest only ever merging PRs from a command line. Thank you
> >> Stephan for the idea! The suggested workflow would look as follows:
> >>
> >>1. Pull the change and rebase on the current main branch
> >>2. Build the project (e.g. from IDE, which should be faster than
> >>building entire project from cmd) -> this should ensure the project
> compiles
> >>3. Run the tests in the module that the change affects -> this should
> >>greatly minimize the chances of failing tests
> >>4. Push the change to the main branch
> >>
> >> Let us know what you think!
> >>
> >> Best,
> >>
> >> Guowei & Dawid
> >>
> >>
> >>
>
>


[jira] [Created] (FLINK-22010) when flink executed union all operators, exception occurred

2021-03-29 Thread zhou (Jira)
zhou created FLINK-22010:


 Summary: when flink executed union all operators, exception occurred
 Key: FLINK-22010
 URL: https://issues.apache.org/jira/browse/FLINK-22010
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.12.1
Reporter: zhou


*When I executed the job on 1.11.2 there was no exception; when I executed it
on 1.12.1 or 1.12.2, the job threw an exception.*

*The code is as follows:*
{quote}result = result1.union_all(result2)
result = result.union_all(result3)
# .union_all(result4).union_all(result5).union_all(result5).union_all(result6).union_all(result7)
result.execute().print()
{quote}
With the code above, when I comment out one line as follows, there is also no
exception on Flink 1.12.1:
{quote}result = result1.union_all(result2)
#result = result.union_all(result3)
# .union_all(result4).union_all(result5).union_all(result5).union_all(result6).union_all(result7)
result.execute().print()
{quote}
I don't know how to solve this problem. Maybe someone could help me?

The exception is as follows:
{quote}py4j.protocol.Py4JJavaError: An error occurred while calling o340.print.
: java.lang.RuntimeException: Failed to fetch next result
	at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109)
	at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80)
	at org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:117)
	at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:350)
	at org.apache.flink.table.utils.PrintUtils.printAsTableauForm(PrintUtils.java:149)
	at org.apache.flink.table.api.internal.TableResultImpl.print(TableResultImpl.java:154)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
	at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
	at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to fetch job execution result
	at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.getAccumulatorResults(CollectResultFetcher.java:169)
	at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:118)
	at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:106)
	... 16 more
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 9ba6325f27c97192e42e76bd52d05db8)
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
	at org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.getAccumulatorResults(CollectResultFetcher.java:167)
	... 18 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 9ba6325f27c97192e42e76bd52d05db8)
	at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.client.program.rest.RestClusterClient.lambda$pollResourceAsync$22(RestClusterClient.java:665)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:394)
	at
{quote}

[jira] [Created] (FLINK-22009) when data type is map, we cannot union

2021-03-29 Thread ying zhang (Jira)
ying zhang created FLINK-22009:
--

 Summary: when data type is map, we cannot union
 Key: FLINK-22009
 URL: https://issues.apache.org/jira/browse/FLINK-22009
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.12.0
Reporter: ying zhang
 Attachments: image-2021-03-29-15-45-48-627.png

I created 2 tables which contain the data type map, but I find I cannot union
the two tables. Is there any solution to this problem?

!image-2021-03-29-15-45-48-627.png!

These are the CREATE TABLE statements:

CREATE TABLE `map_string_string1`(
 `query` string,
 `wid` string,
 `index` int,
 `page` string,
 `hc_cid1` string,
 `hc_cid2` string,
 `hc_cid3` string,
 `cid1` string,
 `cid2` string,
 `cid3` string,
 `ts` bigint,
 `number_feature` map)
PARTITIONED BY (
 `dt` string)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'

 

 

CREATE TABLE `map_string_string2`(
 `query` string,
 `wid` string,
 `index` int,
 `page` string,
 `hc_cid1` string,
 `hc_cid2` string,
 `hc_cid3` string,
 `cid1` string,
 `cid2` string,
 `cid3` string,
 `ts` bigint,
 `number_feature` map)
PARTITIONED BY (
 `dt` string)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22008) writing metadata is not an atomic operation; we should add commit logic

2021-03-29 Thread xiaogang zhou (Jira)
xiaogang zhou created FLINK-22008:
-

 Summary: writing metadata is not an atomic operation; we should 
add commit logic
 Key: FLINK-22008
 URL: https://issues.apache.org/jira/browse/FLINK-22008
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Affects Versions: 1.12.2
Reporter: xiaogang zhou


Writing metadata is not an atomic operation: if the JobManager crashes while
writing the metadata, there can be a metadata file in the checkpoint dir whose
data is corrupted.

 

So we should consider adding a commit operation to the checkpoint storage location.
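
A minimal sketch of one possible commit scheme (an assumption for illustration, not the actual CheckpointStorageLocation code): write the metadata to a temporary file and atomically rename it, so that a crash mid-write never leaves a readable but corrupted _metadata file:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class AtomicMetadataWriteSketch {

    /** Writes the metadata to an in-progress file, then commits via atomic rename. */
    static void writeAndCommit(Path checkpointDir, byte[] metadata) throws IOException {
        Path tmp = checkpointDir.resolve("_metadata.inprogress");
        Path finalFile = checkpointDir.resolve("_metadata");
        Files.write(tmp, metadata);
        // The rename is the "commit": readers either see no _metadata at all or
        // a complete one, never a partially written file.
        Files.move(tmp, finalFile, StandardCopyOption.ATOMIC_MOVE);
    }
}
{code}

Note that this relies on the file system offering an atomic rename, which object stores typically do not; that is presumably why a dedicated commit operation on the checkpoint storage location is needed.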



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-ml] lindong28 commented on pull request #1: [Flink-21976] Move Flink ML pipeline API and library code from apache/flink to apache/flink-ml

2021-03-29 Thread GitBox


lindong28 commented on pull request #1:
URL: https://github.com/apache/flink-ml/pull/1#issuecomment-809144123


   @becketqin Could you review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-ml] lindong28 opened a new pull request #1: [Flink-21976] Move Flink ML pipeline API and library code from apache/flink to apache/flink-ml

2021-03-29 Thread GitBox


lindong28 opened a new pull request #1:
URL: https://github.com/apache/flink-ml/pull/1


   ## What is the purpose of the change
   
   Move Flink ML pipeline API and library code from apache/flink to 
apache/flink-ml
   
   ## Brief change log
   
   - Move files under flink/flink-ml-parent to flink-ml repo
   - Add CODE_OF_CONDUCT.md, LICENSE and .gitignore
   - Add files needed for checkstyle under tools/maven
   - Update pom.xml to include plugins from apache/flink/pom.xml that are 
needed to build and release this flink-ml repo.
   
   ## Verifying this change
   
   1) This PR runs and passes all unit tests.
   2) Run `mvn install` in both `apache/flink-ml` and 
`apache/flink/flink-ml-parent` and verify that they generate the same set of 
files (e.g. *.pom files and *.jar files) at the same path under 
`~/.m2/repository/org/apache/flink/`
   3) `mvn install` generates `flink-ml-api-1.13-SNAPSHOT.jar`, 
`flink-ml-lib_2.11-1.13-SNAPSHOT.jar` and 
`flink-ml-uber_2.11-1.13-SNAPSHOT.jar`. I used IntelliJ to compare the jar 
files with those jar files generated by `apache/flink/flink-ml-parent` and 
verified that they contain the same class files.
   
   The only difference in the jar files is that the jar file from this repo has 
NOTICE that says "Copyright 2019-2020 The Apache Software Foundation", whereas 
the jar file from apache/flink has NOTICE that says "Copyright 2014-2021 The 
Apache Software Foundation". It is not clear to me where that difference 
comes from. I believe this is not a blocking issue.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: [BULK]Re: [SURVEY] Remove Mesos support

2021-03-29 Thread Robert Metzger
+1



On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo  wrote:

> +1
>
> Best,
> Yangze Guo
>
> On Mon, Mar 29, 2021 at 11:31 AM Xintong Song 
> wrote:
> >
> > +1
> > It's already a matter of fact for a while that we no longer port new
> features to the Mesos deployment.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Fri, Mar 26, 2021 at 10:37 PM Till Rohrmann 
> wrote:
> >>
> >> +1 for officially deprecating this component for the 1.13 release.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Thu, Mar 25, 2021 at 1:49 PM Konstantin Knauf 
> wrote:
> >>>
> >>> Hi Matthias,
> >>>
> >>> Thank you for following up on this. +1 to officially deprecate Mesos
> in the code and documentation, too. It will be confusing for users if this
> diverges from the roadmap.
> >>>
> >>> Cheers,
> >>>
> >>> Konstantin
> >>>
> >>> On Thu, Mar 25, 2021 at 12:23 PM Matthias Pohl 
> wrote:
> 
>  Hi everyone,
>  considering the upcoming release of Flink 1.13, I wanted to revive the
>  discussion about the Mesos support once more. Mesos is also already
> listed
>  as deprecated in Flink's overall roadmap [1]. Maybe, it's time to
> align the
>  documentation accordingly to make it more explicit?
> 
>  What do you think?
> 
>  Best,
>  Matthias
> 
>  [1] https://flink.apache.org/roadmap.html#feature-radar
> 
>  On Wed, Oct 28, 2020 at 9:40 AM Till Rohrmann 
> wrote:
> 
>  > Hi Oleksandr,
>  >
>  > yes you are right. The biggest problem is at the moment the lack of
> test
>  > coverage and thereby confidence to make changes. We have some e2e
> tests
>  > which you can find here [1]. These tests are, however, quite coarse
> grained
>  > and are missing a lot of cases. One idea would be to add a Mesos
> e2e test
>  > based on Flink's end-to-end test framework [2]. I think what needs
> to be
>  > done there is to add a Mesos resource and a way to submit jobs to a
> Mesos
>  > cluster to write e2e tests.
>  >
>  > [1] https://github.com/apache/flink/tree/master/flink-jepsen
>  > [2]
>  >
> https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common
>  >
>  > Cheers,
>  > Till
>  >
>  > On Tue, Oct 27, 2020 at 12:29 PM Oleksandr Nitavskyi <
>  > o.nitavs...@criteo.com> wrote:
>  >
>  >> Hello Xintong,
>  >>
>  >> Thanks for the insights and support.
>  >>
>  >> Browsing the Mesos backlog, we didn't identify anything critical
> which
>  >> is left there.
>  >>
>  >> I see that there were quite a lot of contributions to the
> Flink Mesos
>  >> in the recent version:
>  >> https://github.com/apache/flink/commits/master/flink-mesos.
>  >> We plan to validate the current Flink master (or release 1.12
> branch) on our
>  >> Mesos setup. In case of any issues, we will try to propose changes.
>  >> My feeling is that our test results shouldn't affect the Flink 1.12
>  >> release cycle. And if any potential commits will land into the
> 1.12.1 it
>  >> should be totally fine.
>  >>
>  >> In the future, we would be glad to help you guys with any
>  >> maintenance-related questions. One of the highest priorities
> around this
>  >> component seems to be the development of the full e2e test.
>  >>
>  >> Kind Regards
>  >> Oleksandr Nitavskyi
>  >> 
>  >> From: Xintong Song 
>  >> Sent: Tuesday, October 27, 2020 7:14 AM
>  >> To: dev ; user 
>  >> Cc: Piyush Narang 
>  >> Subject: [BULK]Re: [SURVEY] Remove Mesos support
>  >>
>  >> Hi Piyush,
>  >>
>  >> Thanks a lot for sharing the information. It would be a great
> relief that
>  >> you are good with Flink on Mesos as is.
>  >>
>  >> As for the jira issues, I believe the most essential ones should
> have
>  >> already been resolved. You may find some remaining open issues
> here [1],
>  >> but not all of them are necessary if we decide to keep Flink on
> Mesos as is.
>  >>
>  >> At the moment and in the short future, I think help is mostly
> needed on
>  >> testing the upcoming release 1.12 with Mesos use cases. The
> community is
>  >> currently actively preparing the new release, and hopefully we
> could come
>  >> up with a release candidate early next month. It would be greatly
>  >> appreciated if you folks, as experienced Flink on Mesos users, can
> help with
>  >> verifying the release candidates.
>  >>
>  >>
>  >> Thank you~
>  >>
>  >> Xintong Song
>  >>
>  >> [1]
>  >>
> https://issues.apache.org/jira/browse/FLINK-17402?jql=project%20%3D%20FLINK%20AND%20component%20%3D%20%22Deployment%20%2F%20Mesos%22%20AND%20status%20%3D%20Open
>  >> <
>  >>
> 

[jira] [Created] (FLINK-22007) PartitionReleaseInBatchJobBenchmarkExecutor seems to be failing

2021-03-29 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-22007:
--

 Summary: PartitionReleaseInBatchJobBenchmarkExecutor seems to be 
failing
 Key: FLINK-22007
 URL: https://issues.apache.org/jira/browse/FLINK-22007
 Project: Flink
  Issue Type: Bug
  Components: Benchmarks, Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Piotr Nowojski
 Fix For: 1.13.0


Travis CI is failing:
https://travis-ci.com/github/apache/flink-benchmarks/builds/221290042

There is also some problem with the Jenkins builds for the same benchmark.
http://codespeed.dak8s.net:8080/job/flink-scheduler-benchmarks/232

It would also be interesting for the future to understand why the Jenkins build 
is green and to fix that (ideally, if some benchmarks fail, partial results 
should still be uploaded but the Jenkins build should be marked as failed). 
Otherwise issues like that can remain unnoticed for quite a bit of time.

CC [~Thesharing] [~zhuzh]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22006) Could not run more than 20 jobs in a native K8s session with K8s HA enabled

2021-03-29 Thread Yang Wang (Jira)
Yang Wang created FLINK-22006:
-

 Summary: Could not run more than 20 jobs in a native K8s session 
with K8s HA enabled
 Key: FLINK-22006
 URL: https://issues.apache.org/jira/browse/FLINK-22006
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.12.2, 1.13.0
Reporter: Yang Wang
 Attachments: image-2021-03-24-18-08-42-116.png

Currently, if we start a native K8s session cluster with K8s HA enabled, we 
cannot run more than 20 streaming jobs. 

 

The latest job is always initializing, and the previous one is created and 
waiting to be assigned. It seems that some internal resources have been 
exhausted, e.g. the okhttp thread pool, TCP connections, or something else.

!image-2021-03-24-18-08-42-116.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)