Greg Hogan created FLINK-4217:
-
Summary: Gelly drivers should read CSV values as strings
Key: FLINK-4217
URL: https://issues.apache.org/jira/browse/FLINK-4217
Project: Flink
Issue Type
Hi Do,
DataSet provides a stable @Public interface. DataSetUtils is marked
@PublicEvolving, which means it is intended for public use and has stable
behavior, but its method signatures may change. It's also good to limit DataSet to common
methods whereas the utility methods tend to be used for specific
Greg Hogan created FLINK-4172:
-
Summary: Don't proxy a ProxiedObject
Key: FLINK-4172
URL: https://issues.apache.org/jira/browse/FLINK-4172
Project: Flink
Issue Type: Bug
Components
ully
> >>>> merged today)
> >>>> https://github.com/apache/flink/pull/2158
> >>>>
> >>>> In regards to metrics: To add a counter metric a user currently has
> to call
> >>>> "counter(...)" on
> >>>> a MetricGro
It would be great if hash-based combine (FLINK-3477) could make it in to be
tested for this release. We've seen impressive improvements in performance
(though, admittedly, some sort-based enhancements are yet to be worked on).
This PR looks to be ripe.
Also, as we tidy up a few things with Gelly
Greg Hogan created FLINK-4135:
-
Summary: Replace ChecksumHashCode as GraphAnalytic
Key: FLINK-4135
URL: https://issues.apache.org/jira/browse/FLINK-4135
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-4132:
-
Summary: Fix boxed comparison in CommunityDetection algorithm
Key: FLINK-4132
URL: https://issues.apache.org/jira/browse/FLINK-4132
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-4129:
-
Summary: HITSAlgorithm should test for element-wise convergence
Key: FLINK-4129
URL: https://issues.apache.org/jira/browse/FLINK-4129
Project: Flink
Issue Type
Greg Hogan created FLINK-4117:
-
Summary: Wait for CuratorFramework connection to be established
Key: FLINK-4117
URL: https://issues.apache.org/jira/browse/FLINK-4117
Project: Flink
Issue Type
Greg Hogan created FLINK-4113:
-
Summary: Always copy first value in ChainedAllReduceDriver
Key: FLINK-4113
URL: https://issues.apache.org/jira/browse/FLINK-4113
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-4106:
-
Summary: Restructure Gelly docs
Key: FLINK-4106
URL: https://issues.apache.org/jira/browse/FLINK-4106
Project: Flink
Issue Type: Improvement
Components
Greg Hogan created FLINK-4107:
-
Summary: Restructure Gelly docs
Key: FLINK-4107
URL: https://issues.apache.org/jira/browse/FLINK-4107
Project: Flink
Issue Type: Improvement
Components
Greg Hogan created FLINK-4105:
-
Summary: Restructure Gelly docs
Key: FLINK-4105
URL: https://issues.apache.org/jira/browse/FLINK-4105
Project: Flink
Issue Type: Improvement
Components
Greg Hogan created FLINK-4104:
-
Summary: Restructure Gelly docs
Key: FLINK-4104
URL: https://issues.apache.org/jira/browse/FLINK-4104
Project: Flink
Issue Type: Improvement
Components
Greg Hogan created FLINK-4102:
-
Summary: Test failure with checkpoint barriers
Key: FLINK-4102
URL: https://issues.apache.org/jira/browse/FLINK-4102
Project: Flink
Issue Type: Bug
Is "Observer" too passive?
Maintainer -> Guide and/or Shepherd -> Reviewer?
Are the component leads the first name in each list? If so, +1 from me :)
On Wed, Jun 1, 2016 at 1:59 PM, Chesnay Schepler wrote:
> sounds like "Observer" would fit.
>
>
> On 01.06.2016 19:11,
Greg Hogan created FLINK-4003:
-
Summary: Use intrinsics for MathUtils logarithms
Key: FLINK-4003
URL: https://issues.apache.org/jira/browse/FLINK-4003
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3997:
-
Summary: PRNG Skip-ahead
Key: FLINK-3997
URL: https://issues.apache.org/jira/browse/FLINK-3997
Project: Flink
Issue Type: Improvement
Components: Gelly
Hi Stephan,
Is there a design document, prior discussion, or background material on
this enhancement? Am I correct in understanding that this only applies to
DataSet since streams run indefinitely?
Thanks,
Greg
On Mon, May 30, 2016 at 5:49 PM, Stephan Ewen wrote:
> Hi Eron!
Hi Simone,
This can be done with a map followed by a reduce. DataSet#count leverages
accumulators which perform an inherent reduce. Also, DataSet#count
implements RichOutputFormat as an optimization to only require a single
operator. Previously the counting and accumulating was handled in a
13 Ufuk Celebi
9 Fabian Hueske
9 Maximilian Michels
6 Greg Hogan
5 Stefano Baghino
3 smarthi
2 Andrea Sella
2 Gyula Fora
2 Jun Aoki
2 Sachin Goel
2 mjsax
2 zentol
1 Alexander Alexandrov
1 Gabor Gevay
1 Prez Cannady
Hi y'all,
I think this is an oft-requested feature [0] and there are many graph
algorithms for which intermediate output is the desired result. I'd like to
take Stephan up on his offer [1] for pointers.
I have yet to get in deep, but I see that iteration tasks are treated
specially as
Greg Hogan created FLINK-3980:
-
Summary: Remove ExecutionConfig.PARALLELISM_UNKNOWN
Key: FLINK-3980
URL: https://issues.apache.org/jira/browse/FLINK-3980
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3978:
-
Summary: Add contains methods to RuntimeContext
Key: FLINK-3978
URL: https://issues.apache.org/jira/browse/FLINK-3978
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3965:
-
Summary: Delegating GraphAlgorithm
Key: FLINK-3965
URL: https://issues.apache.org/jira/browse/FLINK-3965
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3925:
-
Summary: GraphAlgorithm to filter by maximum degree
Key: FLINK-3925
URL: https://issues.apache.org/jira/browse/FLINK-3925
Project: Flink
Issue Type: New Feature
I also just modify the startup scripts, but would it be better to have
variants of env.java.opts specific to the JobManager, TaskManager, client,
etc.?
On Tue, May 17, 2016 at 5:24 AM, Stephan Ewen wrote:
> Hey Stefano!
>
> I think that question is bound to come up again. I
Hi,
This question has arisen with the HITS algorithm (Hubs and Authorities) but
the question is the same as with PageRank, for which Stephan published an
excellent discussion and comparison of bulk and delta iterations [0].
Delta iterations are clearly faster. Has there been a comparison as to
Greg Hogan created FLINK-3910:
-
Summary: New self-join operator
Key: FLINK-3910
URL: https://issues.apache.org/jira/browse/FLINK-3910
Project: Flink
Issue Type: New Feature
Components
+1 to better scaling :)
Many Jira tickets are good ideas with no current traction. Some have a pull
request (usually closed), many have comments or discussion. It seems these
old tickets tend to hang around because closing the ticket feels like
rejecting the idea. How do we track requested
Greg Hogan created FLINK-3907:
-
Summary: Directed Clustering Coefficient
Key: FLINK-3907
URL: https://issues.apache.org/jira/browse/FLINK-3907
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3906:
-
Summary: Global Clustering Coefficient
Key: FLINK-3906
URL: https://issues.apache.org/jira/browse/FLINK-3906
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3898:
-
Summary: Adamic-Adar Similarity
Key: FLINK-3898
URL: https://issues.apache.org/jira/browse/FLINK-3898
Project: Flink
Issue Type: New Feature
Components
Greg Hogan created FLINK-3879:
-
Summary: Native implementation of HITS algorithm
Key: FLINK-3879
URL: https://issues.apache.org/jira/browse/FLINK-3879
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3877:
-
Summary: Create TranslateFunction interface for Graph translators
Key: FLINK-3877
URL: https://issues.apache.org/jira/browse/FLINK-3877
Project: Flink
Issue Type
Greg Hogan created FLINK-3865:
-
Summary: ExecutionConfig NullPointerException with second execution
Key: FLINK-3865
URL: https://issues.apache.org/jira/browse/FLINK-3865
Project: Flink
Issue
Greg Hogan created FLINK-3853:
-
Summary: Reduce object creation in Gelly utility mappers
Key: FLINK-3853
URL: https://issues.apache.org/jira/browse/FLINK-3853
Project: Flink
Issue Type
Greg Hogan created FLINK-3845:
-
Summary: Gelly allows duplicate vertices in Graph.addVertices
Key: FLINK-3845
URL: https://issues.apache.org/jira/browse/FLINK-3845
Project: Flink
Issue Type: Bug
Matthias,
Won't this be a compile-time error as long as the user is parameterizing
the return type, since .fromElements(OUT...) returns DataStreamSource<OUT>
and will bind to the nearest common superclass? The new
.fromElements(Class<OUT>, OUT...) does give the user the choice of common
superclass.
Greg
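The inference behavior described in this message can be sketched in plain Java. This is a minimal sketch, not Flink code: FromElementsSketch is a hypothetical class, and List stands in for DataStreamSource so the example runs without Flink on the classpath.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the two overloads; List replaces DataStreamSource.
final class FromElementsSketch {

    // Like fromElements(OUT...): OUT is inferred from the arguments and,
    // when present, from the declared type on the left-hand side.
    @SafeVarargs
    static <OUT> List<OUT> fromElements(OUT... data) {
        return Arrays.asList(data);
    }

    // Like fromElements(Class<OUT>, OUT...): the caller names the element type.
    @SafeVarargs
    static <OUT> List<OUT> fromElements(Class<OUT> type, OUT... data) {
        for (OUT element : data) {
            type.cast(element); // fails fast if an element is not an instance of type
        }
        return Arrays.asList(data);
    }

    public static void main(String[] args) {
        // Compiles: both arguments are Numbers, so OUT binds to Number.
        List<Number> inferred = fromElements(1, 2.5);

        // Does not compile: OUT would have to be Integer, but 2.5 is a Double.
        // List<Integer> rejected = fromElements(1, 2.5);

        // The Class overload states the common superclass explicitly.
        List<Number> explicit = fromElements(Number.class, 1, 2.5);

        System.out.println(inferred + " " + explicit);
    }
}
```

When the caller declares the result type, a mismatched element is rejected by the compiler; the Class overload only makes the intended superclass explicit.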
We have also started running over Travis' 2 hour limit for the longest build.
Greg
> On Apr 27, 2016, at 7:53 AM, Ufuk Celebi wrote:
>
> Hi Till,
>
> thank you for bringing this up. We really need to fix this.
>
> Filing JIRAs with critical priority was how we tried to
se we don't know the number of sub tasks yet. In
> the latter case, which can also be caused by large closure objects, we
> should send the job via the blob manager to the `JobManager` to solve the
> problem.
>
> Cheers,
> Till
>
> On Mon, Apr 25, 2016 at 3:45 PM, Greg Ho
Hi,
CollectionInputFormat currently enforces a parallelism of 1 by implementing
NonParallelInput and serializing the entire Collection. If my understanding
is correct this serialized InputFormat is often the cause of a new job
exceeding the akka message size limit.
As an alternative the
Vasia and I are looking for additional feedback on FLINK-3772. This ticket
[0] and PR [1] provide a set of graph algorithms which compute and store
the degree for vertices and edges.
Degree annotation is a basic component of many algorithms. For example,
PageRank requires the vertex out-degree
Vasia and I are looking for additional feedback on FLINK-3771. This ticket
[0] and PR [1] provide methods for translating the type or value of graph
labels, vertex values, and edge values. My use cases are provided in JIRA,
but I think users will find many more.
Translators compose well with
Greg Hogan created FLINK-3799:
-
Summary: Graph checksum should execute single job
Key: FLINK-3799
URL: https://issues.apache.org/jira/browse/FLINK-3799
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3789:
-
Summary: Overload methods which trigger program execution to allow
naming job
Key: FLINK-3789
URL: https://issues.apache.org/jira/browse/FLINK-3789
Project: Flink
Greg Hogan created FLINK-3780:
-
Summary: Jaccard Similarity
Key: FLINK-3780
URL: https://issues.apache.org/jira/browse/FLINK-3780
Project: Flink
Issue Type: New Feature
Components
Greg Hogan created FLINK-3772:
-
Summary: Graph algorithms for vertex and edge degree
Key: FLINK-3772
URL: https://issues.apache.org/jira/browse/FLINK-3772
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3771:
-
Summary: Methods for translating Graphs
Key: FLINK-3771
URL: https://issues.apache.org/jira/browse/FLINK-3771
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3768:
-
Summary: Clustering Coefficient
Key: FLINK-3768
URL: https://issues.apache.org/jira/browse/FLINK-3768
Project: Flink
Issue Type: New Feature
Components
Greg Hogan created FLINK-3721:
-
Summary: Min and max accumulators
Key: FLINK-3721
URL: https://issues.apache.org/jira/browse/FLINK-3721
Project: Flink
Issue Type: New Feature
I'd like to discuss the creation of a macro-benchmarking module for Flink.
This could be run during pre-release testing to detect performance
regressions and during development when refactoring or performance tuning
code on the hot path.
Many users have published benchmarks and the Flink
Greg Hogan created FLINK-3695:
-
Summary: ValueArray types
Key: FLINK-3695
URL: https://issues.apache.org/jira/browse/FLINK-3695
Project: Flink
Issue Type: New Feature
Components: Core
build at:
https://s3.amazonaws.com/apache-flink/flink-1.1-SNAPSHOT.txz
Are you able to replicate with the following command:
$ ./bin/flink run -c org.apache.flink.graph.examples.Graph500 \
    flink-gelly_with_examples_2.10-1.1-SNAPSHOT.jar
On Tue, Mar 15, 2016 at 5:16 PM, Greg Hogan
Greg Hogan created FLINK-3623:
-
Summary: Adjust MurmurHash algorithm
Key: FLINK-3623
URL: https://issues.apache.org/jira/browse/FLINK-3623
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3634:
-
Summary: Fix documentation for DataSetUtils.zipWithUniqueId()
Key: FLINK-3634
URL: https://issues.apache.org/jira/browse/FLINK-3634
Project: Flink
Issue Type
example program with us which reproduces the problem? I
> suspect that, somehow, your user code class BlockInfo is sent directly to
> the JobManager where it is deserialized without the user code class loader.
>
> Cheers,
> Till
>
>
> On Tue, Mar 15, 2016 at 4:19 PM, Greg Hogan
I am seeing a failure running my code starting with commit 0f8d76c6
(ExecutionConfig to JobGraph).
Logs and stack trace are below.
Using default configuration so a single TaskManager. From the web UI, data
port is 33245 and path is akka.tcp://flink@192.168.14.134:41339/user/taskmanager.
>
> I have to dig into the serializers, to see if they could suffer from that.
> The "getField(pos)" method for example should always have many overrides
> (though few would be loaded at any time, because one usually does not use
> all Tuple classes at the same time).
>
I am noticing what looks like the same drop-off in performance when
introducing TupleN subclasses as expressed in "Understanding the JIT and
tuning the implementation" [1].
I start my single-node cluster, run an algorithm which relies purely on
Tuples, and measure the runtime. I execute a
to express their wish for fast
> resolution.
> > I also saw some cases where issues were reopened.
> >
> > I agree with your suggestion to clear the "fix version" field once 1.0.0
> > has been released.
> >
> > On Mon, Feb 22, 2016 at 4:43 PM, Greg Hogan
Hi,
I have two bugfix pull requests in the stack.
[FLINK-3340] [runtime] Fix object juggling in drivers
https://github.com/apache/flink/pull/1626
[FLINK-3437] [web-dashboard] Fix UI router state for job plan
https://github.com/apache/flink/pull/1661
Greg
On Thu, Feb 25, 2016 at 8:32 AM,
Hi Vasia,
In the WebUI, the Subtasks and TaskManagers list the same operator
statistics but expand to show either per-subtask or per-TaskManager
statistics. Summarizing the statistics by TaskManager is valuable when
viewing larger clusters.
Greg
On Thu, Feb 25, 2016 at 11:23 AM, Vasiliki
Greg Hogan created FLINK-3469:
-
Summary: Fix documentation for grouping keys
Key: FLINK-3469
URL: https://issues.apache.org/jira/browse/FLINK-3469
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-3467:
-
Summary: Remove superfluous objects from DataSourceTask.invoke
Key: FLINK-3467
URL: https://issues.apache.org/jira/browse/FLINK-3467
Project: Flink
Issue Type
Hi,
With 1.0.0 imminent there are 112 tickets with a "fix version" of 1.0.0,
the earliest from 2014. From the ticket logs it looks like we typically
bump the fix version once the target release has passed. Would it be better
to wait to assign a fix version until achieving some combination of
Greg Hogan created FLINK-3454:
-
Summary: Add test dependencies on packaged jars
Key: FLINK-3454
URL: https://issues.apache.org/jira/browse/FLINK-3454
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-3453:
-
Summary: Fix TaskManager logs exception when sampling backpressure
while task completes
Key: FLINK-3453
URL: https://issues.apache.org/jira/browse/FLINK-3453
Project
Greg Hogan created FLINK-3447:
-
Summary: Package Gelly algorithms by framework
Key: FLINK-3447
URL: https://issues.apache.org/jira/browse/FLINK-3447
Project: Flink
Issue Type: Improvement
Hi Fabian,
I would only add to your citations Stephan's comment [1] concerning the
design, implementation, and use of object reuse.
I see two separate concerns addressed in code. First, as Stephan noted, for
certain classes deserialization is sufficiently expensive relative to
object creation
Greg Hogan created FLINK-3437:
-
Summary: Fix UI router state for job plan
Key: FLINK-3437
URL: https://issues.apache.org/jira/browse/FLINK-3437
Project: Flink
Issue Type: Sub-task
Greg Hogan created FLINK-3393:
-
Summary: ExternalProcessRunner wait to finish copying error stream
Key: FLINK-3393
URL: https://issues.apache.org/jira/browse/FLINK-3393
Project: Flink
Issue Type
Greg Hogan created FLINK-3395:
-
Summary: Polishing the web UI
Key: FLINK-3395
URL: https://issues.apache.org/jira/browse/FLINK-3395
Project: Flink
Issue Type: Improvement
Components
Greg Hogan created FLINK-3385:
-
Summary: Fix outer join skipping unprobed partitions
Key: FLINK-3385
URL: https://issues.apache.org/jira/browse/FLINK-3385
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-3382:
-
Summary: Improve clarity of object reuse in
ReusingMutableToRegularIteratorWrapper
Key: FLINK-3382
URL: https://issues.apache.org/jira/browse/FLINK-3382
Project: Flink
Is it possible to force operator chaining to be disabled? Similar to how
object reuse can be enabled or disabled?
Greg
)
>
> On Mon, Feb 8, 2016 at 10:34 AM, Greg Hogan <c...@greghogan.com> wrote:
>
> > Is it possible to force operator chaining to be disabled? Similar to how
> > object reuse can be enabled or disabled?
> >
> > Greg
> >
>
Greg Hogan created FLINK-3340:
-
Summary: Fix object juggling in reduce drivers
Key: FLINK-3340
URL: https://issues.apache.org/jira/browse/FLINK-3340
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-3335:
-
Summary: DataSourceTask object reuse when disabled
Key: FLINK-3335
URL: https://issues.apache.org/jira/browse/FLINK-3335
Project: Flink
Issue Type: Bug
Greg Hogan created FLINK-3337:
-
Summary: mvn test fails on flink-runtime because curator classes
not found
Key: FLINK-3337
URL: https://issues.apache.org/jira/browse/FLINK-3337
Project: Flink
If a user modifies keyed fields of a grouped reduce during a combine then
the reduce will receive incorrect groupings. For example, a useless
modification to word count:
    public WC reduce(WC in1, WC in2) {
        return new WC(in1.word + " " + in2.word, in1.count + in2.count);
    }
I don't see an
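The effect described at the start of this message can be reproduced with a small plain-Java simulation. This is a sketch under stated assumptions and uses no Flink API: the two-partition layout, the WC fields, and the helper names reduceGroups and combineThenReduce are all hypothetical, chosen only to mimic a local combine followed by a final grouped reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation; the partition layout and helper names are hypothetical.
final class CombineKeySketch {

    static final class WC {
        final String word;
        final int count;

        WC(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // The reduce function from the message: it rewrites the key field.
    static WC reduce(WC in1, WC in2) {
        return new WC(in1.word + " " + in2.word, in1.count + in2.count);
    }

    // Groups records by their current word and folds each group with reduce().
    static List<WC> reduceGroups(List<WC> records) {
        Map<String, WC> groups = new LinkedHashMap<>();
        for (WC record : records) {
            groups.merge(record.word, record, CombineKeySketch::reduce);
        }
        return new ArrayList<>(groups.values());
    }

    // Combines each partition locally, then runs the final grouped reduce.
    static List<WC> combineThenReduce() {
        List<WC> partition1 = Arrays.asList(new WC("a", 1), new WC("a", 1));
        List<WC> partition2 = Arrays.asList(new WC("a", 1));

        List<WC> combined = new ArrayList<>();
        combined.addAll(reduceGroups(partition1)); // emits ("a a", 2)
        combined.addAll(reduceGroups(partition2)); // emits ("a", 1)

        return reduceGroups(combined); // groups by the rewritten keys
    }

    public static void main(String[] args) {
        for (WC wc : combineThenReduce()) {
            System.out.println("'" + wc.word + "' -> " + wc.count);
        }
    }
}
```

In this simulation the three records for "a" end up in two final groups, because the local combine changed the key of one partial result before the final grouping.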
Greg Hogan created FLINK-3277:
-
Summary: Use Value types in Gelly API
Key: FLINK-3277
URL: https://issues.apache.org/jira/browse/FLINK-3277
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3279:
-
Summary: Optionally disable DistinctOperator combiner
Key: FLINK-3279
URL: https://issues.apache.org/jira/browse/FLINK-3279
Project: Flink
Issue Type: New Feature
Greg Hogan created FLINK-3263:
-
Summary: Log task statistics on TaskManager
Key: FLINK-3263
URL: https://issues.apache.org/jira/browse/FLINK-3263
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3262:
-
Summary: Remove fuzzy versioning from Bower dependencies
Key: FLINK-3262
URL: https://issues.apache.org/jira/browse/FLINK-3262
Project: Flink
Issue Type
Happy Friday,
I am looking to submit a pull request for FLINK-3160 which updates files in
flink-runtime-web. What is the proper way to handle updated dependencies
from bower.json? For example, bootstrap is specified with version "~3.3.5"
which permits the patch update to 3.3.6. When I run `npm
Greg Hogan created FLINK-3219:
-
Summary: Implement DataSet.count using a single operator
Key: FLINK-3219
URL: https://issues.apache.org/jira/browse/FLINK-3219
Project: Flink
Issue Type
Greg Hogan created FLINK-3218:
-
Summary: Merging Hadoop configurations overrides user parameters
Key: FLINK-3218
URL: https://issues.apache.org/jira/browse/FLINK-3218
Project: Flink
Issue Type
Greg Hogan created FLINK-3206:
-
Summary: Heap size for non-pre-allocated off-heap memory
Key: FLINK-3206
URL: https://issues.apache.org/jira/browse/FLINK-3206
Project: Flink
Issue Type: Bug
he network buffers to be re-used by
Netty and save half of the network buffer memory? I created FLINK-3164
which would reduce the number of necessary network buffers.
Greg Hogan
On Fri, Oct 30, 2015 at 12:33 PM, Till Rohrmann <trohrm...@apache.org>
wrote:
> The logging of the TaskManager sto
Greg Hogan created FLINK-3160:
-
Summary: Aggregate operator statistics by TaskManager
Key: FLINK-3160
URL: https://issues.apache.org/jira/browse/FLINK-3160
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3161:
-
Summary: Externalize cluster start-up and tear-down when available
Key: FLINK-3161
URL: https://issues.apache.org/jira/browse/FLINK-3161
Project: Flink
Issue Type
Greg Hogan created FLINK-3162:
-
Summary: Configure number of TaskManager slots as ratio of
available processors
Key: FLINK-3162
URL: https://issues.apache.org/jira/browse/FLINK-3162
Project: Flink
Greg Hogan created FLINK-3164:
-
Summary: Spread out scheduling strategy
Key: FLINK-3164
URL: https://issues.apache.org/jira/browse/FLINK-3164
Project: Flink
Issue Type: Improvement
Greg Hogan created FLINK-3163:
-
Summary: Configure Flink for NUMA systems
Key: FLINK-3163
URL: https://issues.apache.org/jira/browse/FLINK-3163
Project: Flink
Issue Type: Improvement
I am testing again on a 64 node cluster (the JobManager is running fine
having reduced some operators' parallelism and fixed the string conversion
performance).
I am seeing TaskManagers drop like flies every other job or so. I am not
seeing any output in the .out log files corresponding to the
I recently discovered that AWS uses NUMA for its largest nodes. An example
c4.8xlarge:
$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 18 19 20 21 22 23 24 25 26
node 0 size: 29813 MB
node 0 free: 24537 MB
node 1 cpus: 9 10 11 12 13 14 15 16 17 27 28 29 30 31 32 33 34
connection to the
> JobManager?
>
> Greetings,
> Stephan
>
>
> On Thu, Oct 29, 2015 at 9:56 AM, Greg Hogan <c...@greghogan.com> wrote:
>
> > I recently discovered that AWS uses NUMA for its largest nodes. An
> example
> > c4.8xlarge:
> >
> > $ numa
Greg Hogan created FLINK-2909:
-
Summary: Gelly Graph Generators
Key: FLINK-2909
URL: https://issues.apache.org/jira/browse/FLINK-2909
Project: Flink
Issue Type: New Feature
Components
Greg Hogan created FLINK-2908:
-
Summary: Web interface redraw web plan when browser resized
Key: FLINK-2908
URL: https://issues.apache.org/jira/browse/FLINK-2908
Project: Flink
Issue Type