My apologies, I thought we had a consensus already.
Regards
JB
On 12/04/2017 11:22 PM, Eugene Kirpichov wrote:
On Mon, Dec 4, 2017 at 3:22 PM, Lukasz Cwik wrote:
> Since processing can happen out of order, for example if the input was:
> ```
> {"id": "2", "parent_id": "a", "timestamp": 2, "amount": 3}
> {"id": "1", "parent_id": "a", "timestamp": 1, "amount": 1}
> {"id": "1", "parent_id": "a", "timestamp": 3, "amount": 2}
> ```
I agree that the proper API for enabling the use case "do something after
the data has been written" is to return a PCollection of objects where each
object represents the result of writing some identifiable subset of the
data. Then one can apply a ParDo to this PCollection, in order to "do
something after the data has been written".
Since processing can happen out of order, for example if the input was:
```
{"id": "2", "parent_id": "a", "timestamp": 2, "amount": 3}
{"id": "1", "parent_id": "a", "timestamp": 1, "amount": 1}
{"id": "1", "parent_id": "a", "timestamp": 3, "amount": 2}
```
would the output be 3 and then 5 or would you
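The order-dependence Luke is pointing at can be shown with a minimal plain-Java sketch (illustrative names only, not Beam code): keep the latest record per `id` by `timestamp` and report the running total after each arrival. The intermediate totals differ by arrival order even when the final total does not.

```java
import java.util.*;

public class OutOfOrderDemo {
    // One record from the example above, as a POJO.
    record Rec(String id, String parentId, long timestamp, int amount) {}

    // Keeps, per id, the record with the highest timestamp seen so far,
    // and returns the running total of retained amounts after each element.
    static List<Integer> runningTotals(List<Rec> arrivals) {
        Map<String, Rec> latest = new HashMap<>();
        List<Integer> totals = new ArrayList<>();
        for (Rec r : arrivals) {
            Rec prev = latest.get(r.id());
            if (prev == null || r.timestamp() > prev.timestamp()) {
                latest.put(r.id(), r); // newer version of this id wins
            }
            totals.add(latest.values().stream().mapToInt(Rec::amount).sum());
        }
        return totals;
    }

    public static void main(String[] args) {
        Rec a = new Rec("2", "a", 2, 3);
        Rec b = new Rec("1", "a", 1, 1);
        Rec c = new Rec("1", "a", 3, 2);

        // Arrival order from the thread: totals evolve 3 -> 4 -> 5.
        System.out.println(runningTotals(List.of(a, b, c)));
        // A different arrival order gives different intermediate totals,
        // even though the final total (5) is the same.
        System.out.println(runningTotals(List.of(c, b, a)));
    }
}
```

This is only a model of the question, not an answer to it; whatever semantics the IO picks, the intermediate outputs it emits depend on arrival order unless it waits for the watermark.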
I also believe we were still in the investigatory phase for dropping
support for Java 7.
On Mon, Dec 4, 2017 at 2:22 PM, Eugene Kirpichov wrote:
Thanks JB for sending the detailed notes about new stuff in 2.2.0! A lot of
exciting things indeed.
Regarding Java 8: I thought our consensus was to have the release notes say
that we're *considering* going Java8-only, and use that to get more
opinions from the user community - but I can't find
I'm super excited about this release! Great work everyone involved!
On Mon, Dec 4, 2017 at 10:58 AM, Jean-Baptiste Onofré wrote:
+1
At the very least an empty PCollection could be produced with no
promises about its contents but the ability to be followed (e.g. as a
side input), which is forward compatible with whatever actual metadata
one may decide to produce in the future.
On Mon, Dec 4, 2017 at 11:06 AM, Kenneth
It seems like you're trying to use Spark 2.1.0. Apache Beam currently
requires Spark 1.6.3. There is an open pull request[1] to migrate to
Spark 2.2.0.
1: https://github.com/apache/beam/pull/4208/
On Mon, Dec 4, 2017 at 10:58 AM, Opitz, Daniel A wrote:
> We
+dev@
I am in complete agreement with Luke. Data dependencies are easy to
understand and a good way for an IO to communicate and establish causal
dependencies. Converting an IO from PDone to real output may spur further
useful thoughts based on the design decisions about what sort of output is
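The PDone-versus-real-output discussion above can be modeled without Beam at all. In this sketch, all names (`WriteResult`, `writeAll`, `afterWrites`) are illustrative, not Beam API: a write step that returns a collection of results, rather than nothing, is something a follow-up step can consume, which establishes the causal "after the data has been written" dependency through data.

```java
import java.util.*;

public class WriteThenFollow {
    // Illustrative result of writing one identifiable subset of the data
    // (e.g. one file or one partition); not a real Beam type.
    record WriteResult(String subsetId, long elementsWritten) {}

    // A "write" that returns results instead of a PDone-style nothing:
    // each subset produces one WriteResult describing what was written.
    static List<WriteResult> writeAll(Map<String, List<String>> subsets) {
        List<WriteResult> results = new ArrayList<>();
        subsets.forEach((subsetId, elements) ->
            results.add(new WriteResult(subsetId, elements.size())));
        return results;
    }

    // The follow-up step consumes only the write results, so it is
    // causally ordered after the writes have happened.
    static List<String> afterWrites(List<WriteResult> results) {
        List<String> notifications = new ArrayList<>();
        for (WriteResult r : results) {
            notifications.add(
                "wrote " + r.elementsWritten() + " elements to " + r.subsetId());
        }
        return notifications;
    }
}
```

As noted earlier in the thread, even an empty result collection would work purely as a signal to follow (e.g. as a side input), and is forward compatible with richer per-subset metadata later.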
We are trying to submit a Spark job through YARN with the following command:
spark-submit --conf spark.yarn.stagingDir=/path/to/stage --verbose --class
com.my.class --jars /path/to/jar1,/path/to/jar2 /path/to/main/jar/application.jar
The application is being populated in the YARN scheduler
Just an important note that we forgot to mention.
!! The 2.2.0 release will be the last one supporting Spark 1.x and Java 7 !!
Starting from Beam 2.3.0, the Spark runner will work only with Spark 2.x and we
will focus only on Java 8.
Regards
JB
On 12/04/2017 10:15 AM, Jean-Baptiste Onofré wrote:
Thanks Reuven !
I would like to emphasize some highlights of the 2.2.0 release:
- New IOs have been introduced:
* TikaIO, leveraging Apache Tika, allowing users to deal with a lot of different
data formats
* RedisIO to read and write key/value pairs from a Redis server. This IO will
be soon