Do the records have another attribute Z which joins them all together?
Is the set of attributes A, B, C, X, Y, K, L drawn from a fixed set of
values (like enums), or can each attribute be mapped onto a certain number of
states (for example, an attribute A > 50 could be mapped onto a state "exceeds threshold")?
For your
Can you provide more details about the problem you're trying to solve, with
some examples showing input and the expected output?
On Wed, Nov 30, 2016 at 11:08 PM, Manu Zhang wrote:
> Hi,
>
> Recently I’m addressing a problem where users want to trigger after
> watermark
ould also work with certain string
> representations of a timestamp.
>
> Feel free to ping me with any questions about the sorter.
>
> Thanks,
> Mitch
>
> On Tue, Nov 22, 2016 at 7:55 AM, Lukasz Cwik <lc...@google.com> wrote:
>
>> Since the only guarantee for a uni
fice if I could partition the output with a custom
> partition function (for example daywise)…
>
>
>
> Thanks, Rico.
>
>
>
> *From:* Lukasz Cwik [mailto:lc...@google.com]
> *Sent:* Tuesday, 22 November 2016 16:00
> *To:* user@beam.inc
There is no explicit support for sorting in the Beam model today because
the problem space is large; for the use cases people typically have, it
generally suffices to do a global combine and sort in memory, or to do a
combine per key with a radix-like scheme and sort each radix individually.
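To make the second approach concrete, here is a minimal plain-Java sketch (not the Beam API; all names are illustrative) of the combine-per-key-then-sort idea: records are grouped by key, and each group's values are then sorted independently in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerKeySort {
    // Group records by key, then sort each group's values in memory.
    // This mirrors "combine per key, then sort each radix individually",
    // expressed in plain Java rather than as a Beam pipeline.
    static Map<String, List<Integer>> sortPerKey(List<Map.Entry<String, Integer>> records) {
        Map<String, List<Integer>> grouped = new HashMap<>();
        for (Map.Entry<String, Integer> r : records) {
            grouped.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        grouped.values().forEach(Collections::sort); // in-memory sort per key
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = List.of(
            Map.entry("a", 3), Map.entry("b", 1), Map.entry("a", 1), Map.entry("b", 2));
        Map<String, List<Integer>> out = sortPerKey(records);
        System.out.println(out.get("a")); // [1, 3]
        System.out.println(out.get("b")); // [1, 2]
    }
}
```

In a real pipeline the grouping step would be a GroupByKey and the per-group sort would run inside a DoFn; this only works when each key's group fits in a worker's memory.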
Can you give
Cheers!
>
> Matthias
>
> On Tue, Nov 22, 2016 at 2:14 PM, Lukasz Cwik <lc...@google.com> wrote:
>
>> The javadoc goes through a lengthy explanation:
>> http://beam.incubator.apache.org/documentation/sdks/javadoc/
>> 0.3.0-incubating/org/apache/beam/sdk/tran
Are you asking about python or java?
On Thu, Nov 10, 2016 at 3:56 AM, 陈竞 wrote:
> I want to use Beam in my company. I want to know whether Beam's core
> is stable,
> and when Beam's first stable version will be released?
>
> thanks.
>
of them have the
> requirements ?
>
> On Sat, Oct 29, 2016 at 12:06 AM Lukasz Cwik <lc...@google.com> wrote:
>
>> For it to be considered a combiner, the function needs to be associative
>> and commutative.
>>
>> The issue is that from an API perspective it would be
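The associativity and commutativity requirement can be checked directly. A hedged sketch (plain Java, illustrative names) showing why sum qualifies as a combiner while subtraction does not:

```java
public class CombinerLaws {
    // A combiner must be associative and commutative so the runner can
    // apply it in any grouping and any order (e.g. partial combines on
    // different workers, merged later).
    static long sum(long a, long b) { return a + b; }        // valid combiner
    static long subtract(long a, long b) { return a - b; }   // NOT a valid combiner

    public static void main(String[] args) {
        // Associative and commutative: grouping and order do not matter.
        System.out.println(sum(sum(1, 2), 3) == sum(1, sum(2, 3))); // true
        System.out.println(sum(1, 2) == sum(2, 1));                  // true
        // Subtraction fails both laws, so the result would depend on how
        // the runner happened to partition and order the data.
        System.out.println(subtract(subtract(1, 2), 3) == subtract(1, subtract(2, 3))); // false
        System.out.println(subtract(1, 2) == subtract(2, 1));                            // false
    }
}
```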
For 2)
Depending on the runner, the Iterable inside the KV may
not be materialized in memory.
For example, in Dataflow, the Iterable is implemented as a pointer that
you walk, which loads and caches blocks of data to prevent the out-of-memory
issues that are common for keys with lots of values.
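The block-loading idea can be sketched in plain Java. This is an illustrative toy, not Dataflow's actual implementation: an Iterable that fetches fixed-size blocks on demand, so only one block is resident at a time no matter how many values the key has.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Illustrative sketch: an Iterable that pages through values in blocks
// instead of materializing everything, similar in spirit to how a runner
// can walk a large per-key Iterable without holding it all in memory.
public class BlockedIterable implements Iterable<Integer> {
    private final int total;     // total number of values "stored"
    private final int blockSize; // values fetched per block

    BlockedIterable(int total, int blockSize) {
        this.total = total;
        this.blockSize = blockSize;
    }

    // Simulates fetching one block from backing storage; only blockSize
    // values are ever resident at once.
    private int[] loadBlock(int start) {
        int n = Math.min(blockSize, total - start);
        int[] block = new int[n];
        for (int i = 0; i < n; i++) block[i] = start + i;
        return block;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int[] block = new int[0];
            private int posInBlock = 0;
            private int consumed = 0;

            @Override
            public boolean hasNext() { return consumed < total; }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                if (posInBlock == block.length) { // current block exhausted
                    block = loadBlock(consumed);  // lazily fetch the next one
                    posInBlock = 0;
                }
                consumed++;
                return block[posInBlock++];
            }
        };
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int v : new BlockedIterable(10, 3)) sum += v; // walks 4 blocks
        System.out.println(sum); // 45
    }
}
```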
The Spark runner currently only supports a limited set of sources because
the Spark runner implementation has no support for handling the
Read PTransform directly.
It is able to support TextIO, AvroIO, and a few others by manually
providing direct implementations.
The supported
Amir, it seems like you're attempting to build a network simulation (
https://en.wikipedia.org/wiki/Network_simulation). Are you sure Apache Beam
is the right tool for this?
On Wed, Aug 24, 2016 at 3:54 PM, Thomas Groh wrote:
> The Beam model generally is agnostic to the rate at
I was under the impression that we had several @RunnableOnService
integration tests that executed across runners.
Also, doesn't WordCount work on the DirectRunner, Flink and Dataflow? (
Java 8 streams conceptually follow parts of the big data processing model
that came out of MapReduce.
Java 8 streams are limited to in-memory transformations, but Beam
abstracts away the datasets and allows for arbitrarily large
transformations (think petabytes of data), so I believe that comparing
oo.com>
>
> *Sent:* Wednesday, June 22, 2016 2:18 AM
>
> *Subject:* Re: Multi-threading implementation equivalence in Beam
>
> Lukasz is of course correct in assuming that Flink does nothing to
> synchronize accesses to Redis (or any other external system, for that
You should be able to provide a fixed number of iterations with a for-like
loop that is "unrolled", constructing a larger pipeline, which will allow you
to submit the pipeline less often.
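The unrolling idea can be sketched with plain Java function composition standing in for Beam PTransforms (illustrative names, not the Beam API): instead of submitting a pipeline once per iteration, the per-iteration transform is chained a fixed number of times while building the pipeline, and the larger pipeline is submitted once.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class UnrolledPipeline {
    // One iteration's transform; here it simply increments every element.
    // In Beam this would be a PTransform applied repeatedly during
    // pipeline construction.
    static Function<List<Integer>, List<Integer>> step() {
        return xs -> xs.stream().map(x -> x + 1).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Function<List<Integer>, List<Integer>> pipeline = Function.identity();
        int iterations = 5;
        for (int i = 0; i < iterations; i++) {
            pipeline = pipeline.andThen(step()); // unroll: chain another stage
        }
        // The whole unrolled chain runs as a single "submission".
        System.out.println(pipeline.apply(List.of(0, 10))); // [5, 15]
    }
}
```

The trade-off is a larger pipeline graph in exchange for far fewer job submissions, which only works when the number of iterations is known at construction time.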
On Wed, Jun 22, 2016 at 1:10 PM, Frances Perry wrote:
> Correct -- Beam pipelines currently
both nodes
> DoFn processes) & a concurrentHashMap for another set of data
> I assume FlinkCluster maintains the thread safety of Redis &
> concurrentHashMap objects.
> Is this the right assumption?
> Thanks again.
> Amir-
>
>
> --
> *F
g in-memory db such as Redis or Aerospike for intermediate look
> ups etc.
> Is this what the above statement is referring to: don't use in-memory DBs?
> Thanks again.
>
>
> --
> *From:* Lukasz Cwik <lc...@google.com>
> *To:* user@beam.incubator.a