Re: Why is my RabbitMq message never acknowledged ?

2019-06-14 Thread Ismaël Mejía
Is there a JIRA for this? If this solves an issue for multiple users, maybe it is worth integrating the patch. Would you be up to do this Augustin? On Fri, Jun 14, 2019 at 10:35 AM Augustin Lafanechere wrote: > > Hello Nicolas, > I also encountered the same problem. > RabbitMQIo indeed acknowledge

Re: AvroIO read SpecificRecord with custom reader schema?

2019-06-14 Thread Ismaël Mejía
> There is an alternative approach with getting PCollection using AvroIO > and then converting Row to SpecificRecord using new Schemas APIs Yes but this will have a higher run time, no ? Maybe worth a JIRA + PR for this feature. On Fri, Jun 14, 2019 at 10:15 AM Gleb Kanterov wrote: > > There is

Re: [ANNOUNCE] Spark portable runner (batch) now available for Java, Python, Go

2019-06-18 Thread Ismaël Mejía
I have been thrilled to watch this happening from the front row. Thanks a lot Kyle. Excellent work! On Mon, Jun 17, 2019 at 9:15 PM Ankur Goenka wrote: > > Thanks Kyle! > This is a great addition towards supporting portability on Beam. > > On Mon, Jun 17, 2019 at 9:21 AM Ahmet Altay wrote:

Re: gRPC method to get a pipeline definition?

2019-06-26 Thread Ismaël Mejía
+1 don't hesitate to create a JIRA + PR. You may be interested in [1]. This is a simple util class that takes a proto pipeline object and converts it into its graph representation in .dot format. You can easily reuse the code or the idea as a first approach to show what the pipeline is about. [1]

Re: [Python] Read Hadoop Sequence File?

2019-07-02 Thread Ismaël Mejía
(Adding dev@ and Solomon Duskis to the discussion) I was not aware of these thanks for sharing David. Definitely it would be a great addition if we could have those donated as an extension in the Beam side. We can even evolve them in the future to be more FileIO like. Any chance this can happen? M

Re: [Python] Read Hadoop Sequence File?

2019-07-03 Thread Ismaël Mejía
ow to make them coexist with HadoopFormatIO > though. > > > On Tue, Jul 2, 2019 at 10:55 AM Solomon Duskis wrote: >> >> +Igor Bernstein who wrote the Cloud Bigtable Sequence File classes. >> >> Solomon Duskis | Google Cloud clients | sdus...@google.com | 914-46

Re: Beam KinesisIO Migration V1 to V2

2019-10-01 Thread Ismaël Mejía
+dev On Tue, Oct 1, 2019 at 8:35 PM Ismaël Mejía wrote: > > Thanks a lot Cam for bringing this document to the mailing list (I left > some comments there). There was a recent proposal doc about supporting > async on Beam so you may be interested in taking a look at the > evolution

Re: Beam KinesisIO Migration V1 to V2

2019-10-01 Thread Ismaël Mejía
Thanks a lot Cam for bringing this document to the mailing list (I left some comments there). There was a recent proposal doc about supporting async on Beam so you may be interested in taking a look at the evolution of that [1]. It is definitely interesting for the implications for IO authors. I pi

[CVE-2020-1929] Apache Beam MongoDB IO connector disables certificate trust verification

2020-01-15 Thread Ismaël Mejía
CVE-2020-1929 Apache Beam MongoDB IO connector disables certificate trust verification Severity: Major Vendor: The Apache Software Foundation Versions Affected: Apache Beam 2.10.0 to 2.16.0 Description: The Apache Beam MongoDB connector in versions 2.10.0 to 2.16.0 has an option to disable SSL t

Re: Kafka Avro Schema Registry Support

2020-02-04 Thread Ismaël Mejía
Support for Confluent Schema Registry was merged into KafkaIO today. You can test it with tomorrow's snapshots (version 2.20.0-SNAPSHOT) or just when 2.20.0 gets released. Notice that this was already possible, but Alexey took care of making this more user friendly because this is (was) a frequentl

Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

2020-02-07 Thread Ismaël Mejía
+user@beam.apache.org On Fri, Feb 7, 2020 at 12:54 AM Xander Song wrote: > I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I > have followed the instructions at this repo > > to > create

Re: Beam 2.19.0 / Flink 1.9.1 - Session cluster error when submitting job "Multiple environments cannot be created in detached mode"

2020-02-24 Thread Ismaël Mejía
We are cutting the release branch for 2.20.0 next wednesday, so not sure if these tickets will make it, but hopefully. For ref, BEAM-9295 Add Flink 1.10 build target and Make FlinkRunner compatible with Flink 1.10 BEAM-9299 Upgrade Flink Runner to 1.8.3 and 1.9.2 In any case if you have cycles to

Re: Unbounded input join Unbounded input then write to Bounded Sink

2020-02-25 Thread Ismaël Mejía
Hello, Sinks are not bounded or unbounded, they are just normal ParDos (DoFns) that behave consistently with the pipeline data, so if your pipeline deals with unbounded data the sink will write this data correspondingly (when windows close, triggers match, etc so data is ready to be out). One pat

Re: Beam 2.19.0 / Flink 1.9.1 - Session cluster error when submitting job "Multiple environments cannot be created in detached mode"

2020-02-26 Thread Ismaël Mejía
ting / filing / collecting the issues. >>>> >>>> There is a fix pending: https://github.com/apache/beam/pull/10950 >>>> >>>> As for the upgrade issues, the 1.8 and 1.9 upgrade is trivial. I will >>>> check out the Flink 1.10 PR tomorrow.

Re: Beam 2.19.0 / Flink 1.9.1 - Session cluster error when submitting job "Multiple environments cannot be created in detached mode"

2020-02-27 Thread Ismaël Mejía
t Beam jar. I am still investigating this. >> >> How can I test the Flink 1.10 runners? (The following POM is not >> resolvable by maven) >> >> >> org.apache.beam >> beam-runners-flink-1.10 >> 2.20-SNAPSHOT &

Re: Problem with Classgraph in Beam 2.19

2020-03-05 Thread Ismaël Mejía
Can you please create a JIRA for this issue. On Thu, Mar 5, 2020 at 2:35 PM Péter Farkas wrote: > Hi, > > I'm trying to upgrade to version 2.19, but when I try to test it I keep > getting this error: > Exception in thread "main" java.lang.RuntimeException: Failed to construct > instance from fac

Re: Problem with Classgraph in Beam 2.19

2020-03-05 Thread Ismaël Mejía
ter Farkas wrote: > >> BEAM-9452 >> >> On Thu, 5 Mar 2020 at 15:04, Ismaël Mejía wrote: >> >>> Can you please create a JIRA for this issue. >>> >>> On Thu, Mar 5, 2020 at 2:35 PM Péter Farkas >>> wrote: >>> >>&

Re: Problem with Classgraph in Beam 2.19

2020-03-05 Thread Ismaël Mejía
ne and that you guys are looking > into the other issue. These have so far been blockers for us to move > forward from 2.17. > > Best, > Kjetil > > On Thu, Mar 5, 2020 at 4:27 PM Ismaël Mejía wrote: > >> Thanks Péter for bringing this info and creating the issue, tha

Re: Problem with Classgraph in Beam 2.19

2020-03-05 Thread Ismaël Mejía
BEAM-9452 has been solved today the fix will be included in the next release 2.20.0 Thanks again for reporting Péter On Thu, Mar 5, 2020 at 5:25 PM Ismaël Mejía wrote: > Oh we were not aware of that issue, next time do not hesitate to let us > know in advance. > > On Thu, Mar 5, 202

Re: Hello Beam Community!

2020-03-13 Thread Ismaël Mejía
Welcome ! On Fri, Mar 13, 2020 at 3:00 PM Connell O'Callaghan wrote: > Welcome Brittany > > On Fri, Mar 13, 2020 at 6:45 AM Rustam Mehmandarov > wrote: > >> Welcome Brittany! >> >> Cheers, >> Rustam >> @rmehmandarov >> >> On Fri, Mar 13, 2020 at 2:31 AM Br

Re: Meetups

2020-03-25 Thread Ismaël Mejía
That sounds like a great idea. We can do (and hopefully record) a virtual meetup. Have you checked the 'logistics' for this? I think Alex Van Boxel was looking for something similar so maybe he can share his findings https://twitter.com/alexvb/status/1239816659763953664?s=20 Any volunteers for s

Re: Running NexMark Tests

2020-04-21 Thread Ismaël Mejía
You need to instruct the Flink runner to shut down the source, otherwise it will stay waiting. You can do this by adding the extra argument `--shutdownSourcesOnFinalWatermark=true`. And if that works and you want to open a PR to update our documentation that would be greatly appreciated. Regards, Ism

Re: Stateful & Timely Call

2020-04-24 Thread Ismaël Mejía
Sounds like a good addition to the Beam patterns page Reza :) On Fri, Apr 24, 2020 at 3:22 AM Aniruddh Sharma wrote: > > Thanks Robert, > > This is a life saver and its a great help :). It works like a charm. > > Thanks > Aniruddh > > On Thu, Apr 23, 2020 at 4:45 PM Robert Bradshaw wrote: >> >>

Re: Running Nexmark for Flink Streaming

2020-04-28 Thread Ismaël Mejía
Max would it make sense to make the rocksdb runtime only at the runner level just to hint its use. I assume that most Flink users might want to have RocksDB as the default state backend? runtimeOnly "org.apache.flink:flink-statebackend-rocksdb_2.11:$flink_version" On Tue, Apr 28, 2020 at 1:1

Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-15 Thread Ismaël Mejía
The reason why most people are using AWSv1 IOs is probably because they are in Beam since 2017 instead of just added in the last year which is the case of the AWSv2 ones. Alexey mentions that maintaining both versions is becoming painful and I would like to expand on that because we have now dupli

Re: Support streaming side-inputs in the Spark runner

2020-10-05 Thread Ismaël Mejía
The limitation of not being able to have side inputs in streaming has been pending for a long time, and sadly, to my knowledge, nobody is working on this. One extra aspect to keep in mind is that the support for streaming in the Spark runner uses the Spark DStream API which does not

Combine with multiple outputs case Sample and the rest

2020-12-18 Thread Ismaël Mejía
I had a question today from one of our users about Beam’s Sample transform (a Combine with an internal top-like function to produce a uniform sample of size n of a PCollection). They wanted to obtain also the rest of the PCollection as an output (the non sampled elements). My suggestion was to use

Re: Combine with multiple outputs case Sample and the rest

2020-12-23 Thread Ismaël Mejía
r the fact should work well (unless there's duplicate elements, in > which case you'd have to uniquify them somehow to filter out only the "right" > copies). > > - Robert > > > > On Fri, Dec 18, 2020 at 8:20 AM Ismaël Mejía wrote: >> >> I had
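The "filter after the fact" idea quoted above, including the caveat about duplicate elements, can be illustrated outside Beam. What follows is only a plain-Java sketch and not Beam API: the class `SampleAndRest`, its `split` method, and the choice of list positions as the "uniquifying" key are all hypothetical names and assumptions made for illustration. It draws n positions uniformly at random and then partitions the input by position, so equal values still land on the correct side.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Beam: splits a list into a uniform
// sample of size n and the remaining (non-sampled) elements.
public class SampleAndRest {

  /** Holds the two outputs: the sampled elements and everything else. */
  public static final class Split<T> {
    public final List<T> sample = new ArrayList<>();
    public final List<T> rest = new ArrayList<>();
  }

  public static <T> Split<T> split(List<T> input, int n, Random random) {
    // "Uniquify" elements by their position so duplicates stay distinguishable.
    List<Integer> positions = new ArrayList<>();
    for (int i = 0; i < input.size(); i++) {
      positions.add(i);
    }
    // Shuffling and keeping the first n positions gives a uniform sample
    // without replacement.
    Collections.shuffle(positions, random);
    int size = Math.max(0, Math.min(n, positions.size()));
    Set<Integer> chosen = new HashSet<>(positions.subList(0, size));

    // Filter after the fact: route each element by whether its position was chosen.
    Split<T> result = new Split<>();
    for (int i = 0; i < input.size(); i++) {
      if (chosen.contains(i)) {
        result.sample.add(input.get(i));
      } else {
        result.rest.add(input.get(i));
      }
    }
    return result;
  }
}
```

In a real pipeline the unique key would have to come from the data or be attached in an earlier step, since a PCollection has no positional index; the sketch only shows why the key is needed.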

Re: [VOTE] Release 2.27.0, release candidate #1

2020-12-24 Thread Ismaël Mejía
It might be a good idea to include also: [BEAM-11403] Cache UnboundedReader per UnboundedSourceRestriction in SDF Wrapper DoFn https://github.com/apache/beam/pull/13592 So Java development experience is less affected (as with 2.26.0) (There is a flag to exclude but defaults matter). On Thu, Dec

Re: beam and compatible Flink Runner versions

2020-12-28 Thread Ismaël Mejía
It seems the website has not been updated with the latest changes. I just opened https://github.com/apache/beam/pull/13620/files for this Just to clarify the newer versions of Flink are supported in these versions: 2.21.0 - 2.24.0 Support for Flink 1.10.0 2.25.0 - 2.26.0 Support for Flink 1.11.0

Re: [VOTE] Release 2.27.0, release candidate #1

2020-12-28 Thread Ismaël Mejía
alidations. I'm cancelling this RC, and >> I'll perform cherry picks to prepare the next one. >> >> Please update this thread with any other cherry pick requests! >> -P. >> >> On Thu, Dec 24, 2020, 3:17 AM Ismaël Mejía wrote: >>> >>> It mi

Re: Quick question regarding ParquetIO

2021-01-18 Thread Ismaël Mejía
Catching up on this thread, sorry if late to the party :) and my excuses because this is going to be long but worth it. > It does look like BEAM-11460 could work for you. Note that relies on a dynamic > object which won't work with schema-aware transforms and SqlTransform. It's > likely this isn't

Re: Apache Beam's UX Research Findings Readout

2021-02-12 Thread Ismaël Mejía
Is there a recorded version of this presentation? For the ones that missed it. On Thu, Feb 11, 2021 at 6:06 PM Carlos Camacho wrote: > > Hi everyone, > This is a friendly reminder to join the UX Research Findings Readout. > > We are live now! Join us: https://meet.google.com/xfc-majk-byk > > --

Re: [DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-11 Thread Ismaël Mejía
+user > Should we add a warning or something to 2.29.0? Sounds like a good idea. On Thu, Mar 11, 2021 at 7:24 PM Kenneth Knowles wrote: > > Should we add a warning or something to 2.29.0? > > On Thu, Mar 11, 2021 at 10:19 AM Ismaël Mejía wrote: >> >> Hello, &g

Re: [DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-12 Thread Ismaël Mejía
rote: >> >> +1 >> >> D. >> >> On Thu, Mar 11, 2021 at 8:33 PM Ismaël Mejía wrote: >>> >>> +user >>> >>> > Should we add a warning or something to 2.29.0? >>> >>> Sounds like a good idea. >>> >

Re: Write to multiple IOs in linear fashion

2021-03-24 Thread Ismaël Mejía
+dev Since we all agree that we should return something different than PDone the real question is what should we return. As a reminder we had a pretty interesting discussion about this already in the past but uniformization of our return values has not happened. This thread is worth reading for Vi

Re: Extremely Slow DirectRunner

2021-05-08 Thread Ismaël Mejía
Can you try running direct runner with the option `--experiments=use_deprecated_read` Seems like an instance of https://issues.apache.org/jira/browse/BEAM-10670?focusedCommentId=17316858&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17316858 also reported in https:/
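For pipelines that build their options in code, the same flag can be passed through `PipelineOptionsFactory`. This is a minimal sketch, assuming the Beam Java SDK core is on the classpath; the class name `DeprecatedReadExperiment` and the helper `parse` are hypothetical, while `ExperimentalOptions` and its static `hasExperiment` check come from the SDK.

```java
import org.apache.beam.sdk.options.ExperimentalOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Hypothetical wrapper class, used only to make the sketch self-contained.
public class DeprecatedReadExperiment {

  /** Parses command-line style arguments into ExperimentalOptions. */
  public static ExperimentalOptions parse(String... args) {
    return PipelineOptionsFactory.fromArgs(args).as(ExperimentalOptions.class);
  }

  public static void main(String[] args) {
    // The flag is the one suggested in the message above.
    ExperimentalOptions options = parse("--experiments=use_deprecated_read");
    // Confirms the experiment was picked up before the pipeline is created.
    System.out.println(ExperimentalOptions.hasExperiment(options, "use_deprecated_read"));
  }
}
```

Checking the flag with `hasExperiment` before creating the pipeline is a cheap way to rule out a typo in the argument.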

[DISCUSS] Drop support for Flink 1.10

2021-05-28 Thread Ismaël Mejía
Hello, With Beam support for Flink 1.13 just merged it is the time to discuss the end of support for Flink 1.10 following the agreed policy on supporting only the latest three Flink releases [1]. I would like to propose that for Beam 2.31.0 we stop supporting Flink 1.10 [2]. I prepared a PR for t

Re: No filesystem found for scheme hdfs

2021-05-31 Thread Ismaël Mejía
You probably need to include the beam-sdks-java-io-hadoop-file-system module. On Mon, May 31, 2021 at 11:41 AM Gershi, Noam wrote: > Hi > > > > I am using Spark-runner, and when I am using Apache Beam TextIO to read a > file from HDFS: > > > > .apply(TextIO.read().from(“hdfs://path-to-file”) >

Re: Provider com.fasterxml.jackson.module.jaxb.JaxbAnnotationModule not a subtype

2021-06-05 Thread Ismaël Mejía
Hello, seems to be a known issue: https://issues.apache.org/jira/browse/BEAM-10430 I don't know however if someone has already found a proper fix or workaround. On Fri, Jun 4, 2021 at 8:22 PM Trevor Kramer wrote: > > Relating to my earlier message I sometimes get this error instead. > > java.uti

Re: KafkaIO: reset topic for reading from the start with every run

2017-01-24 Thread Ismaël Mejía
One extra reminder, if you use the DirectRunner you can set the DirectOptions to make the validations of the runner loose (and gain some speed improvements). setEnforceImmutability(false) setEnforceEncodability(false) On Mon, Jan 23, 2017 at 8:22 PM, Gareth Western wrote: > Thanks Thomas. I'll
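The two setters named above live on `DirectOptions`. The following is a minimal sketch of applying them, assuming the Beam direct runner artifact is on the classpath; the class name `LooseDirectRunnerOptions` and the helper `create` are hypothetical.

```java
import org.apache.beam.runners.direct.DirectOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Hypothetical wrapper class, used only to make the sketch self-contained.
public class LooseDirectRunnerOptions {

  /** Builds DirectOptions with the runner's extra validations switched off. */
  public static DirectOptions create(String... args) {
    DirectOptions options = PipelineOptionsFactory.fromArgs(args).as(DirectOptions.class);
    // Skip the check that elements are not mutated after being output.
    options.setEnforceImmutability(false);
    // Skip the check that every element can be encoded with its coder.
    options.setEnforceEncodability(false);
    return options;
  }
}
```

These checks exist to catch bugs that would only surface on a distributed runner, so relaxing them is probably best kept to local runs where speed matters more than validation.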

[no subject]

2017-02-08 Thread Ismaël Mejía
Hello, I was testing a pipeline that produces SessionWindows and then calculates a Mean afterwards in 'batch' mode and I found this issue while running with the Flink Runner. 17/02/08 09:27:24 INFO org.apache.beam.runners.flink.translation.FlinkBatchPipelineTranslator: | | | | visitPrimiti

Re:

2017-02-09 Thread Ismaël Mejía
> > Cheers, > Aljoscha > > On Wed, 8 Feb 2017 at 12:16 Ismaël Mejía wrote: > >> Hello, >> >> I was testing a pipeline that produces SessionWindows and then calculates >> a Mean afterwards in 'batch' mode and I found this issue while ru

Re: New blog post: "Stateful processing with Apache Beam"

2017-02-15 Thread Ismaël Mejía
Great post, I like the use of the previous figure style with geometric forms and colors, as well as the table analogy that really helps to understand the concepts. I am still digesting some of the consequences of the State API, in particular the implications of using state that you mention at the e

Re: Approach to writing to Redis in Streaming Pipeline

2017-03-16 Thread Ismaël Mejía
Hello, Probably it is not worth the effort to write a new RedisIO from zero considering there is an ongoing Pull Request for this. https://github.com/apache/beam/pull/1687 Maybe you can take a look if the current WIP is enough for your needs, and eventually give a hand there to improve it if it

Re: automatic runner inference

2017-04-04 Thread Ismaël Mejía
Antony, You can do this explicitly when building your pipeline from the command args: Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class); and when you run your app you pass --runner=YourFavoriteRunner and it will resolve, however different runners can need

Re: Apache Beam Slack channel

2017-04-28 Thread Ismaël Mejía
Done. On Fri, Apr 28, 2017 at 3:32 PM, Andrew Psaltis wrote: > Please add me as well. Thanks, > > On Fri, Apr 28, 2017 at 7:59 AM, Anuj Kumar wrote: > >> Thanks >> >> On Fri, Apr 28, 2017 at 3:56 PM, Aviem Zur wrote: >> >>> Invitation sent. >>> >>> On Fri, Apr 28, 2017 at 1:24 PM Anuj Kumar w

Re: Slack channel invite

2017-05-02 Thread Ismaël Mejía
Done. On Tue, May 2, 2017 at 1:05 PM, Josh Di Fabio wrote: > Please will someone kindly invite joshdifa...@gmail.com to the Beam slack > channel?

Re: Beam Slack channel

2017-06-29 Thread Ismaël Mejía
Invitation sent! On Thu, Jun 29, 2017 at 9:16 AM, Patrick Reames wrote: > Can i also get an invite? > > On 2017-06-25 08:51 (-0500), Aleksandr wrote: >> Hello,> >> Can someone please add me to the slack channel?> >> >> Best regards> >> Aleksandr Gortujev.> >> >

Re: Failed to run Wordcount example

2017-08-16 Thread Ismaël Mejía
Hello, The error message shows that it is looking for the Beam 0.1 version and that version does not exist in maven central. You have to replace the version of Beam in the command you executed with the latest version that means 2.0.0 at this moment and it should work. Regards, Ismaël On Wed, Au

Re: Failed to run Wordcount example

2017-08-16 Thread Ismaël Mejía
st the way on the quick start page. It seems that the example somehow > takes its own version as some beam dependency's version accidentally. > > BTW, I'm using the latest master branch. > > Thanks, > Huafeng > > > Ismaël Mejía 于2017年8月16日周三 下午3:57写道: >>

[VOTE] [DISCUSSION] Remove support for Java 7

2017-10-17 Thread Ismaël Mejía
We have discussed recently in the developer mailing list about the idea of removing support for Java 7 on Beam. There are multiple reasons for this: - Java 7 has not received public updates for almost two years and most companies are moving / have already moved to Java 8. - A good amount of the sy

Re: [VOTE] [DISCUSSION] Remove support for Java 7

2017-10-18 Thread Ismaël Mejía
gt;>> > This >>> > path could perhaps be considered if we had evidence that switching to a >>> > Beam >>> > release without Java7 support would require 0 work for an overwhelming >>> > majority of users) >>> > >>> > >&

Re: [VOTE] [DISCUSSION] Remove support for Java 7

2017-10-18 Thread Ismaël Mejía
> +1 > > - > Srinivas > > - Typed on tiny keys. pls ignore typos.{mobile app} > > On 17-Oct-2017 9:47 PM, "Ismaël Mejía" wrote: >> >> We have discussed recently in the developer mailing list about the >> idea of removing support for Java 7 on Beam. There

Re: IBM Streams now supports Apache Beam Java applications

2017-11-08 Thread Ismaël Mejía
Congratulations, this is a nice feature for the IBM Cloud and of course great news for the Apache Beam community. Do you have specific IBM specific IOs? I noticed you guys have an implementation of the OpenStack's Swift FileSystem as part of your SDK. Any plans to contribute this or other parts in

Re: BEAM counters for validation

2017-11-27 Thread Ismaël Mejía
Thanks for bringing this question Holden. I have also been thinking about this for a while and I have the impression that Beam needs to expose more ‘system’ metrics to the users, so far we have mostly cared about filling the user-defined metrics space. However once anyone starts using Beam it is no

Re: Reading from ORC Files in HDFS

2017-12-18 Thread Ismaël Mejía
Hello, There is no support yet for reading ORC files directly in Beam. You can track the progress of this issue here: https://issues.apache.org/jira/browse/BEAM-1861 You would be better off using HCatalogIO than JdbcIO (the split should be better). On Mon, Dec 18, 2017 at 4:17 AM, Allan Wilson wrote: > Hi, >

Re: Question on basic version changes

2018-01-15 Thread Ismaël Mejía
Hello, If you have a concrete proposal you can send it to the dev@ mailing list. The common procedure is to share a google docs document and people will comment on it. This + mailing list are the preferred mechanisms, also if you prefer some 'real time' interactivity you can also go into the slack

Re: Strata Conference this March 6-8

2018-01-16 Thread Ismaël Mejía
Maybe a good idea to try to organize a Beam meetup in london in the same dates in case some of the people around can jump in and talk too. On Wed, Jan 17, 2018 at 2:51 AM, Ron Gonzalez wrote: > Works for me... > > On Tuesday, January 16, 2018, 5:45:33 PM PST, Holden Karau > wrote: > > > How woul

Re: Strata Conference this March 6-8

2018-01-18 Thread Ismaël Mejía
of the Beam >>> London Meetup) can help us to plan something. >>> >>> Regards >>> JB >>> >>> >>> On 01/17/2018 08:57 AM, Ismaël Mejía wrote: >>>> >>>> Maybe a good idea to try to organize a Beam meetup in london i

Re: Deprecate and remove support for Kafka 0.9.x and older version

2018-02-06 Thread Ismaël Mejía
Agree with JB, showing deprecation is ok, but I think it is worth to support Kafka 0.9.x for some extra time. Users tend to stay in old data stores because migrating these clusters isn't always so easy. On Tue, Feb 6, 2018 at 3:56 PM, Jean-Baptiste Onofré wrote: > +1 to flag as deprecated, but I

Re: Regarding Beam SlackChannel

2018-02-15 Thread Ismaël Mejía
Done, you will receive an email, welcome! On Thu, Feb 15, 2018 at 2:33 PM, Willy Lulciuc wrote: > Hello: > > Can someone please add me to the Beam slackchannel? > > Thanks.

Re: Get file from S3 bucket

2018-02-22 Thread Ismaël Mejía
Hello, Beam 2.3.0 introduced a native reader for S3, see the module https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-amazon-web-services/2.3.0 You should add this dependency to your project and then you can read using the Read transform. It supports authentication so you can r

Re: "Radically modular data ingestion APIs in Apache Beam" @ Strata - slides available

2018-03-08 Thread Ismaël Mejía
Excellent, loved the 'Nobody writes a paper about their IO API'. IO is such an important but less valued part of Big Data, kind of ironic. Great work Eugene ! On Thu, Mar 8, 2018 at 9:40 PM, Kenneth Knowles wrote: > Love it. Great flashy title, too :-) > > On Thu, Mar 8, 2018 at 12:16 PM Eugene K

Re: Slack Invite

2018-04-26 Thread Ismaël Mejía
Done, welcome! On Tue, Apr 24, 2018 at 5:47 PM, Preston Marshall wrote: > Can someone send me an invite to the Slack? The invite link on the website > is broken.

Re: Normal Spark Streaming vs Streaming on Beam with Spark Runner

2018-05-16 Thread Ismaël Mejía
Hello, Answers to the questions inline: > 1. Are there any limitations in terms of implementations, functionalities or performance if we want to run streaming on Beam with Spark runner vs streaming on Spark-Streaming directly ? At this moment the Spark runner does not support some parts of the B

Re: I'm back and ready to help grow our community!

2018-05-22 Thread Ismaël Mejía
I missed somehow this email thread. Congratulations Gris and welcome back! On Fri, May 18, 2018 at 5:34 AM Jesse Anderson wrote: > Congrats! > On Thu, May 17, 2018, 6:44 PM Robert Burke wrote: >> Congrats & welcome back! >> On Thu, May 17, 2018, 5:44 PM Huygaa Batsaikhan wrote: >>> Welcome

Re: Documentation for Beam on Windows

2018-06-01 Thread Ismaël Mejía
Is there a JIRA for this? Can you create one Udi? On Wed, May 23, 2018 at 11:32 PM Lukasz Cwik wrote: > > There is none to my knowledge. > > On Wed, May 23, 2018 at 1:49 PM Udi Meiri wrote: >> >> Hi all, >> >> I was looking yesterday for a quickstart guide on how to use Beam on Windows >> but sa

Re: kafkaIO Run with Spark Runner: "streaming-job-executor-0"

2018-06-13 Thread Ismaël Mejía
Can you please update the version of Beam to at least version 2.2.0. There were some important fixes in streaming after the 2.0.0 release so this could be related. Ideally you should use the latest released version (2.4.0). Remember that starting with Beam 2.3.0 the Spark runner is based on Spark 2

Re: [FYI] New Apache Beam Swag Store!

2018-06-13 Thread Ismaël Mejía
Great ! Thanks Gris and Matthias for putting this in place. Hope to get that hoodie soon. As a suggestion, more colors too, and eventually a t-shirt just with the big B logo. On Mon, Jun 11, 2018 at 6:50 PM Mikhail Gryzykhin wrote: > > That's nice! > > More colors are appreciated :) > > --Mikhail

Re: Go SDK: How are re-starts handled?

2018-06-27 Thread Ismaël Mejía
Eduardo can you please create a JIRA on the Go SDK to track this issue. Thanks. On Mon, Jun 25, 2018 at 10:22 PM Lukasz Cwik wrote: > Ah, sorry for the confusion. The SDK is meant to handle that for you as I > described. You'll want to use the fact that the 409 was returned until that > is imple

Re: about PCollection process

2018-07-06 Thread Ismaël Mejía
Hello, If I understood correctly you read from a file some parameters that you are going to use to prepare an HBase Scan. If this is the case you cannot do this with the current HBaseIO API, but there is ongoing work to support this transparently with the new SDF API. If you want to track the prog

Re: [ANNOUNCE] Apache Beam 2.6.0 released!

2018-08-09 Thread Ismaël Mejía
Two really interesting features in 2.6.0 not mentioned in the announcement email: - Bounded SplittableDoFn support is available now in all runners (SDF is the new IO connector API). - HBaseIO was updated to be the first IO supporting Bounded SDF (using readAll). On Fri, Aug 10, 2018 at 12:14

Re: [Discuss] Upgrade story for Beam's execution engines

2018-09-17 Thread Ismaël Mejía
In the Spark runner the user provides the core spark dependencies at runtime and we assume that backwards compatibility is kept (in upstream Spark). We support the whole 2.x line but we try to keep the version close to the latest stable release. Notice however that we lack tests to validate that a

Modular IO presentation at Apachecon

2018-09-26 Thread Ismaël Mejía
Hello, today Eugene and I gave a talk about modular APIs for IO at ApacheCon. This talk introduces some common patterns that we have found while creating IO connectors and also presents recent ideas like dynamic destinations, sequential writes among others using FileIO as a use case. In case

Re: Issue with GroupByKey in BeamSql using SparkRunner

2018-10-10 Thread Ismaël Mejía
Are you trying this in a particular spark distribution or just locally ? I ask this because there was a data corruption issue with Spark 2.3.1 (previous version used by Beam) https://issues.apache.org/jira/browse/SPARK-23243 Current Beam master (and next release) moves Spark to version 2.3.2 and t

Re: RabbitMqIO missing in Maven Central

2018-11-08 Thread Ismaël Mejía
Hello, RabbitMQ was merged into master after the 2.8.0 release, so you will have to wait until 2.9.0 is released, or compile/package it by yourself. Regards, Ismaël On Thu, Nov 8, 2018 at 10:10 PM Jeroen Steggink | knowsy wrote: > > Hi guys, > > I tried getting the new RabbitMqIO, however, it's

Re: Moving to spark 2.4

2018-12-07 Thread Ismaël Mejía
Hello Vishwas, The spark dependency in the spark runner is provided so you can already pass the dependencies of spark 2.4 and it should work out of the box. JB did a PR to upgrade the version of Spark in the runner, but maybe it is worth to wait a bit before merging it, at least until some of the

Re: Latin America Community

2018-12-07 Thread Ismaël Mejía
Hello, It is a great idea to try to grow the community in the region. Notice that already there are multiple latino members in the dev community (e.g. Pablo, Gris and me). However no Brasilians so far, so glad that you want to be part. I suppose that given Sao Paulo's size it is probably the 'eas

Re: Moving to spark 2.4

2018-12-07 Thread Ismaël Mejía
ks & Regards, > Vishwas > > > On Fri, Dec 7, 2018 at 2:53 PM Ismaël Mejía wrote: >> >> Hello Vishwas, >> >> The spark dependency in the spark runner is provided so you can >> already pass the dependencies of spark 2.4 and it should work out of >>

Re: Recordings and presentations from Beam Summit London 2018

2018-12-21 Thread Ismaël Mejía
Thanks for sharing Mathias, I also did not attend and I really want to see some of the presentations. Great timing just before the holidays! Regards, Ismaël On Fri, Dec 21, 2018 at 2:54 PM OrielResearch Eila Arich-Landkof < e...@orielresearch.org> wrote: > Thank you for sharing. will definitel

Re: Suggestion or Alternative simples to read file from FTP

2019-01-08 Thread Ismaël Mejía
Some time ago there was a PR that added support for VFS; with this we could, at least theoretically, support FTP and even HTTP listings. However this needs some love to rebase and update it, but if you are interested in hacking on this don't hesitate to contact me and I can mentor you to do so or eventua

Re: Beam Python streaming pipeline on Flink Runner

2019-01-31 Thread Ismaël Mejía
> Fortunately, there is already a pending PR for cross-language pipelines which > will allow us to use Java IO like PubSub in Python jobs. In addition to have support in the runners, this will require a rewrite of PubsubIO to use the new SDF API. On Thu, Jan 31, 2019 at 12:23 PM Maximilian Michel

Re: Beam Python streaming pipeline on Flink Runner

2019-01-31 Thread Ismaël Mejía
te of PubsubIO to use the new SDF API. > > Not necessarily. This would be one way. Another way is build an SDF wrapper > for > UnboundedSource. Probably the easier path for migration. > > On 31.01.19 14:03, Ismaël Mejía wrote: > >> Fortunately, there is already a pen

Re: joda-time dependency version

2019-03-20 Thread Ismaël Mejía
Hello, The long term goal would be to get rid of joda-time but that won't happen until Beam 3. Any 'particular' reason or motivation to push the upgrade? Regards, Ismaël On Wed, Mar 20, 2019 at 11:53 AM rahul patwari wrote: > > Hi, > > Is there a plan to upgrade the dependency version of joda-t

Re: joda-time dependency version

2019-03-21 Thread Ismaël Mejía
We are using Beam with Spark Runner and Spark 2.4 has joda-time 2.9.3 as a >>> dependency. So, we have used joda-time 2.9.3 in our shaded artifact set. As >>> Beam has joda-time 2.4 as a dependency, I was wondering whether it would >>> break anything in Beam. >>>

Re: joda-time dependency version

2019-03-21 Thread Ismaël Mejía
Does anyone have any context on why we have such an old version of Joda time (2.4 released on 2014!) and if there is any possible issue upgrading it? If not maybe we can try to upgrade it.. On Thu, Mar 21, 2019 at 5:35 PM Ismaël Mejía wrote: > > Mmmm interesting issue. There is also a p

Re: Do I have any control over bundle sizes?

2019-04-04 Thread Ismaël Mejía
It seems you can 'hack' it with the State API. See the discussion on this ticket: https://issues.apache.org/jira/browse/BEAM-6886 On Thu, Apr 4, 2019 at 9:42 PM Jeff Klukas wrote: > > As far as I can tell, Beam expects runners to have full control over > separation of individual elements into bu

Re: Couchbase

2019-04-08 Thread Ismaël Mejía
Hello, Guobao is working on this, but he is OOO at least until end of next week so if you can wait it will be available 'soon'. If you need this urgently and you decide to write your own implementation of write, it would be a valuable contribution that I will be happy to review. Regards, Ismaël

Re: Beam's HCatalogIO for Hive Parquet Data

2019-05-02 Thread Ismaël Mejía
Hello, Support for Parquet in HCatalog (Hive) started in version 3.0.0 HIVE-8838 Support Parquet through HCatalog https://issues.apache.org/jira/browse/HIVE-8838?attachmentSortBy=dateTime The current version used in Beam is 2.1.0. I filed a new JIRA to tackle it BEAM-7209 - Update HCatalogIO to

Re: Beam Website Feedback

2022-02-27 Thread Ismaël Mejía
Hello Abe, Can you check if you are subscribed to the user mailing list? It seems you are subscribed to dev@, so maybe the issue is the missing user@ subscription. You can do this by sending an email to: user-subscr...@beam.apache.org Ahmet and the others, it might be a good idea to mention the user@ mailing lis