Re: Beam Website Feedback

2022-10-27 Thread Brian Hulette via dev
I proposed https://github.com/apache/beam/pull/23877 to address this.

On Thu, Oct 27, 2022 at 2:12 PM Sachin Agarwal wrote:

> No objections here.  The latter (the page that would remain) is the one
> linked in the top navigation bar, and it has the helpful cross-language
> (x-lang) details.
>
> On Thu, Oct 27, 2022 at 2:09 PM Brian Hulette wrote:
>
>> Hm, it seems like we need to drop
>> https://beam.apache.org/documentation/io/built-in/ as it's been
>> superseded by https://beam.apache.org/documentation/io/connectors/
>>
>> Would there be any objections to that?
>>
>> On Thu, Oct 27, 2022 at 2:04 PM Sachin Agarwal via dev <
>> dev@beam.apache.org> wrote:
>>
>>> JDBCIO is available as a Java-based IO.  It is also listed on
>>> https://beam.apache.org/documentation/io/connectors/
>>>
>>> On Thu, Oct 27, 2022 at 2:01 PM Charles Kangai <
>>> char...@charleskangai.co.uk> wrote:
>>>
>>>> What about jdbc?
>>>> I want to use Beam to read/write to/from a relational database, e.g.
>>>> Oracle or Microsoft SQL Server.
>>>> I don’t see a connector on your page:
>>>> *https://beam.apache.org/documentation/io/built-in*
>>>>
>>>> Thanks,
>>>> Charles Kangai
>>>>
>>>


[Proposal] Beam MultimapState API

2022-10-27 Thread 郑卜千
Hi all,

I've been working on adding MultimapState support to the Dataflow Runner,
and the state interface is currently missing from the Beam State API.

I have a one-pager proposing its API at
https://docs.google.com/document/d/1zm16QCxWEWNy4qW1KKoA3DmQBOLrTSPimXiqQkTxokc/edit#.
Please share suggestions/comments!

Thanks!
Buqian Zheng


Re: Beam Website Feedback

2022-10-27 Thread Sachin Agarwal via dev
No objections here.  The latter (the page that would remain) is the one
linked in the top navigation bar, and it has the helpful cross-language
(x-lang) details.

On Thu, Oct 27, 2022 at 2:09 PM Brian Hulette wrote:

> Hm, it seems like we need to drop
> https://beam.apache.org/documentation/io/built-in/ as it's been
> superseded by https://beam.apache.org/documentation/io/connectors/
>
> Would there be any objections to that?
>
> On Thu, Oct 27, 2022 at 2:04 PM Sachin Agarwal via dev <
> dev@beam.apache.org> wrote:
>
>> JDBCIO is available as a Java-based IO.  It is also listed on
>> https://beam.apache.org/documentation/io/connectors/
>>
>> On Thu, Oct 27, 2022 at 2:01 PM Charles Kangai <
>> char...@charleskangai.co.uk> wrote:
>>
>>> What about jdbc?
>>> I want to use Beam to read/write to/from a relational database, e.g.
>>> Oracle or Microsoft SQL Server.
>>> I don’t see a connector on your page:
>>> *https://beam.apache.org/documentation/io/built-in*
>>> 
>>>
>>> Thanks,
>>> Charles Kangai
>>>
>>>
>>>
>>


Re: Beam Website Feedback

2022-10-27 Thread Brian Hulette via dev
Hm, it seems like we need to drop
https://beam.apache.org/documentation/io/built-in/ as it's been superseded
by https://beam.apache.org/documentation/io/connectors/

Would there be any objections to that?

On Thu, Oct 27, 2022 at 2:04 PM Sachin Agarwal via dev 
wrote:

> JDBCIO is available as a Java-based IO.  It is also listed on
> https://beam.apache.org/documentation/io/connectors/
>
> On Thu, Oct 27, 2022 at 2:01 PM Charles Kangai <
> char...@charleskangai.co.uk> wrote:
>
>> What about jdbc?
>> I want to use Beam to read/write to/from a relational database, e.g.
>> Oracle or Microsoft SQL Server.
>> I don’t see a connector on your page:
>> *https://beam.apache.org/documentation/io/built-in*
>> 
>>
>> Thanks,
>> Charles Kangai
>>
>>
>>
>


Re: Beam Website Feedback

2022-10-27 Thread Sachin Agarwal via dev
JDBCIO is available as a Java-based IO.  It is also listed on
https://beam.apache.org/documentation/io/connectors/
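
For anyone who would rather stay in Python, here is a minimal sketch of
how the Java JdbcIO might be reached through the cross-language
ReadFromJdbc transform in apache_beam.io.jdbc. The host, database, table
and credential values are placeholders, and the helper names (jdbc_url,
build_read) are made up for illustration. It also assumes the driver jar
for SQL Server or Oracle is made available to the Java expansion service,
which this sketch does not set up.

```python
# Hedged sketch: reading a table over JDBC from the Beam Python SDK.
# All connection values passed in are placeholders chosen by the caller.

# JDBC driver class names for the two databases named in the question.
DRIVERS = {
    "sqlserver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "oracle": "oracle.jdbc.OracleDriver",
}


def jdbc_url(kind, host, port, database):
    """Build a JDBC URL for SQL Server or Oracle."""
    if kind == "sqlserver":
        return f"jdbc:sqlserver://{host}:{port};databaseName={database}"
    if kind == "oracle":
        # Here `database` is treated as the Oracle service name.
        return f"jdbc:oracle:thin:@//{host}:{port}/{database}"
    raise ValueError(f"unsupported database kind: {kind}")


def build_read(kind, host, port, database, table, username, password):
    """Return a ReadFromJdbc transform; requires apache_beam installed."""
    # Imported lazily so the URL helper above works without Beam.
    from apache_beam.io.jdbc import ReadFromJdbc

    return ReadFromJdbc(
        table_name=table,
        driver_class_name=DRIVERS[kind],
        jdbc_url=jdbc_url(kind, host, port, database),
        username=username,
        password=password,
    )
```

Inside a pipeline this would be applied as `p | build_read(...)`. Writing
follows the same shape with WriteToJdbc from the same module.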

On Thu, Oct 27, 2022 at 2:01 PM Charles Kangai 
wrote:

> What about jdbc?
> I want to use Beam to read/write to/from a relational database, e.g.
> Oracle or Microsoft SQL Server.
> I don’t see a connector on your page:
> *https://beam.apache.org/documentation/io/built-in*
> 
>
> Thanks,
> Charles Kangai
>
>
>


Beam Website Feedback

2022-10-27 Thread Charles Kangai
What about jdbc?
I want to use Beam to read/write to/from a relational database, e.g. Oracle or 
Microsoft SQL Server.
I don't see a connector on your page: 
https://beam.apache.org/documentation/io/built-in

Thanks,
Charles Kangai




Re: Beam starter projects dependency updates

2022-10-27 Thread Brian Hulette via dev
Could we just use the same set of reviewers as pr-bot in the main repo [1]?
I don't think that we could avoid duplicating the data though.

[1]
https://github.com/apache/beam/blob/728e8ecc8a40d3d578ada7773b77eca2b3c68d03/.github/REVIEWERS.yml

On Thu, Oct 27, 2022 at 12:20 PM David Cavazos via dev 
wrote:

> Hi everyone!
>
> We want to make sure the Beam starter projects always come with the latest
> (compatible) versions for every dependency. I enabled Dependabot on all of
> them to automate this as much as possible, and we have automated tests to
> make sure everything works as expected.
>
> However, we still need someone to merge Dependabot's PRs. The good news is
> that since the starter projects are so simple, if tests pass they're most
> likely safe to merge, and tests only take a couple minutes to run.
>
> We could either batch update all dependencies as part of the release
> process, or have people check them periodically (like an owner per
> language).
>
> These are all the repos we have to keep an eye on:
>
>    - https://github.com/apache/beam-starter-java -- 9 updates, all tests
>    passing
>    - https://github.com/apache/beam-starter-python -- 2 updates, all
>    tests passing
>    - https://github.com/apache/beam-starter-go -- 0 updates
>    - https://github.com/apache/beam-starter-kotlin -- 3 updates, all
>    tests passing
>    - https://github.com/apache/beam-starter-scala -- not done yet, but
>    keep an eye on it
>
>


Beam starter projects dependency updates

2022-10-27 Thread David Cavazos via dev
Hi everyone!

We want to make sure the Beam starter projects always come with the latest
(compatible) versions for every dependency. I enabled Dependabot on all of
them to automate this as much as possible, and we have automated tests to
make sure everything works as expected.

However, we still need someone to merge Dependabot's PRs. The good news is
that since the starter projects are so simple, if tests pass they're most
likely safe to merge, and tests only take a couple minutes to run.

We could either batch update all dependencies as part of the release
process, or have people check them periodically (like an owner per
language).

These are all the repos we have to keep an eye on:

   - https://github.com/apache/beam-starter-java -- 9 updates, all tests
   passing
   - https://github.com/apache/beam-starter-python -- 2 updates, all tests
   passing
   - https://github.com/apache/beam-starter-go -- 0 updates
   - https://github.com/apache/beam-starter-kotlin -- 3 updates, all tests
   passing
   - https://github.com/apache/beam-starter-scala -- not done yet, but keep
   an eye on it
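
If we go with periodic checks, something like the following could give
whoever is on duty a quick overview. It is only a sketch: it assumes the
public GitHub REST API endpoint for listing pull requests and that
Dependabot's PRs are authored by the login "dependabot[bot]". The
function names are made up for illustration, and unauthenticated calls
are rate-limited.

```python
import json
import urllib.request

# The starter repos listed above.
STARTER_REPOS = [
    "apache/beam-starter-java",
    "apache/beam-starter-python",
    "apache/beam-starter-go",
    "apache/beam-starter-kotlin",
    "apache/beam-starter-scala",
]


def pulls_url(repo):
    """URL of the GitHub REST endpoint listing a repo's open PRs."""
    return f"https://api.github.com/repos/{repo}/pulls?state=open&per_page=100"


def dependabot_prs(pulls):
    """Keep only the PRs whose author is the Dependabot bot account."""
    return [
        pr for pr in pulls
        if (pr.get("user") or {}).get("login") == "dependabot[bot]"
    ]


def fetch_open_pulls(repo):
    """Fetch open PRs for one repo (network call, unauthenticated)."""
    request = urllib.request.Request(
        pulls_url(repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize():
    """Print the open Dependabot PRs for every starter repo."""
    for repo in STARTER_REPOS:
        prs = dependabot_prs(fetch_open_pulls(repo))
        print(f"{repo}: {len(prs)} open Dependabot PR(s)")
        for pr in prs:
            print(f"  #{pr['number']} {pr['title']}")
```

Calling summarize() prints one line per repo plus one line per pending
update. It does not check test status, so merging would still need a
look at the CI result on each PR.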


Beam Dependency Check Report (2022-10-27)

2022-10-27 Thread Apache Jenkins Server
<<< text/html; charset=UTF-8: Unrecognized >>>


Beam High Priority Issue Report (45)

2022-10-27 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/23815 [Bug]: Neo4j tests failing
https://github.com/apache/beam/issues/23745 [Bug]: Samza 
AsyncDoFnRunnerTest.testSimplePipeline is flaky
https://github.com/apache/beam/issues/23709 [Flake]: Spark batch flakes in 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElement and 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
https://github.com/apache/beam/issues/22969 Discrepancy in behavior of 
`DoFn.process()` when `yield` is combined with `return` statement, or vice versa
https://github.com/apache/beam/issues/22321 
PortableRunnerTestWithExternalEnv.test_pardo_large_input is regularly failing 
on jenkins
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output 
to Failed Inserts PCollection
https://github.com/apache/beam/issues/21561 
ExternalPythonTransformTest.trivialPythonTransform flaky
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: 
Connection refused
https://github.com/apache/beam/issues/21462 Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21261 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit 
data at GC time
https://github.com/apache/beam/issues/21123 Multiple jobs running on Flink 
session cluster reuse the persistent Python environment.
https://github.com/apache/beam/issues/21113 
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky
https://github.com/apache/beam/issues/20976 
apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics
 is flaky
https://github.com/apache/beam/issues/20975 
org.apache.beam.runners.flink.ReadSourcePortableTest.testExecution[streaming: 
false] is flaky
https://github.com/apache/beam/issues/20974 Python GHA PreCommits flake with 
grpc.FutureTimeoutError on SDK harness startup
https://github.com/apache/beam/issues/20689 Kafka commitOffsetsInFinalize OOM 
on Flink
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit 
empty pane when it should
https://github.com/apache/beam/issues/19814 Flink streaming flakes in 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundleStateful and 
ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
https://github.com/apache/beam/issues/19734 
WatchTest.testMultiplePollsWithManyResults flake: Outputs must be in timestamp 
order (sickbayed)
https://github.com/apache/beam/issues/19241 Python Dataflow integration tests 
should export the pipeline Job ID and console output to Jenkins Test Result 
section


P1 Issues with no update in the last week:

https://github.com/apache/beam/issues/23627 [Bug]: Website precommit flaky
https://github.com/apache/beam/issues/23489 [Bug]: add DebeziumIO to the 
connectors page
https://github.com/apache/beam/issues/23306 [Bug]: BigQueryBatchFileLoads in 
python loses data when using WRITE_TRUNCATE
https://github.com/apache/beam/issues/23286 [Bug]: 
beam_PerformanceTests_InfluxDbIO_IT Flaky > 50 % Fail 
https://github.com/apache/beam/issues/22913 [Bug]: 
beam_PostCommit_Java_ValidatesRunner_Flink is flakes in 
org.apache.beam.sdk.transforms.GroupByKeyTest$BasicTests.testAfterProcessingTimeContinuationTriggerUsingState
https://github.com/apache/beam/issues/22891 [Bug]: 
beam_PostCommit_XVR_PythonUsingJavaDataflow is flaky
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for 
dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/22115 [Bug]: 
apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses
 is flaky
https://github.com/apache/beam/issues/22011 [Bug]: 
org.apache.beam.sdk.io.aws2.kinesis.KinesisIOWriteTest.testWriteFailure flaky
https://github.com/apache/beam/issues/21893 [Bug]: BigQuery Storage Write API 
implementation does not support table partitioning
https://github.com/apache/beam/issues/21714 
PulsarIOTest.testReadFromSimpleTopic is very flaky
https://github.com/apache/beam/issues/21711 Python Streaming job failing to 
drain with BigQueryIO write errors
https://github.com/apache/beam/issues/21709 
beam_PostCommit_Java_ValidatesRunner_Samza Failing
https://github.com/apache/beam/issues/21708 beam_PostCommit_Java_DataflowV2, 
testBigQueryStorageWrite30MProto failing consistently
https://github.com/apache/beam/issues/21707 GroupByKeyTest BasicTests 
testLargeKeys100MB flake (on ULR)
https://github.com/apache/beam/issues/21706 Flaky timeout in github Python unit 
test action