Re: Limitations on grouped ReduceFunction
For now, I think it is worth documenting and leaving it as it is.

A while back we thought about adding a static code analysis rule to find such cases and emit a warning. For Reduce that is quite straightforward; for GroupReduce it is quite tricky...

On 02.02.2016 at 21:55, "Greg Hogan" wrote:

> If a user modifies keyed fields of a grouped reduce during a combine then
> the reduce will receive incorrect groupings. For example, a useless
> modification to word count:
>
>   public WC reduce(WC in1, WC in2) {
>     return new WC(in1.word + " " + in2.word, in1.count + in2.count);
>   }
>
> I don't see an efficient means to prevent this. Is this limitation worth
> documenting, or can we safely assume that no one will ever attempt this?
> MapReduce also has this limitation, and Spark gets around this by
> separating keys and values and only presenting values to reduce.
>
> "Reduce on Grouped DataSet: A Reduce transformation that is applied on a
> grouped DataSet reduces each group to a single element using a user-defined
> reduce function. For each group of input elements, a reduce function
> successively combines pairs of elements into one element until only a
> single element for each group remains."
>
> Greg
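The effect described in the quoted message can be reproduced outside Flink. The following is a minimal plain-Java simulation, not Flink's runtime: it assumes a combine step that reduces each partition locally and a final step that regroups the combined elements by their `word` field. The names `GroupedReduceKeyDemo`, `reduceByWord`, `runWithCombiner`, and `samplePartitions` are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class GroupedReduceKeyDemo {
    static final class WC {
        final String word;
        final int count;
        WC(String word, int count) { this.word = word; this.count = count; }
    }

    // Leaves the key field untouched, so regrouping after the combine still works.
    static final BinaryOperator<WC> KEY_PRESERVING =
            (a, b) -> new WC(a.word, a.count + b.count);

    // The "useless modification" from the thread: the key field changes during the reduce.
    static final BinaryOperator<WC> KEY_MODIFYING =
            (a, b) -> new WC(a.word + " " + b.word, a.count + b.count);

    // Groups elements by their current word field and folds each group with fn.
    static List<WC> reduceByWord(List<WC> input, BinaryOperator<WC> fn) {
        Map<String, WC> groups = new LinkedHashMap<>();
        for (WC wc : input) {
            groups.merge(wc.word, wc, fn);
        }
        return new ArrayList<>(groups.values());
    }

    // Simulated combine per partition, followed by a final reduce that regroups
    // the combined elements by whatever key they now carry.
    static List<WC> runWithCombiner(List<List<WC>> partitions, BinaryOperator<WC> fn) {
        List<WC> combined = new ArrayList<>();
        for (List<WC> partition : partitions) {
            combined.addAll(reduceByWord(partition, fn));
        }
        return reduceByWord(combined, fn);
    }

    // Three occurrences of the same word, spread over two partitions.
    static List<List<WC>> samplePartitions() {
        return Arrays.asList(
                Arrays.asList(new WC("a", 1), new WC("a", 1)),
                Arrays.asList(new WC("a", 1)));
    }

    public static void main(String[] args) {
        System.out.println(runWithCombiner(samplePartitions(), KEY_PRESERVING).size());
        System.out.println(runWithCombiner(samplePartitions(), KEY_MODIFYING).size());
    }
}
```

With the key-preserving function the three elements end up in one group with count 3. With the key-modifying function the first partition emits an element keyed "a a" after the combine, so the final step sees two groups instead of one.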
Re: Connector Documentation missing
Most of the flume code is commented out IIRC?

Sent from my iPhone

> On Feb 3, 2016, at 8:24 AM, Matthias J. Sax wrote:
>
> Hi,
>
> I just observed that there are 7 flink-streaming-connectors available
> but only 5 are documented on the web page.
>
> Flume and Nifi are not documented. Did we miss to extend the
> documentation for both (which should have been part of the commit of the
> code) or was this left out on purpose?
>
> -Matthias
Re: Connector Documentation missing
There is currently only a FlumeSink. The FlumeSource is a dummy file (copied from the AvroSource) and needs to be removed.

On Wed, Feb 3, 2016 at 3:11 PM, Matthias J. Sax wrote:

> FlumeSink is there. FlumeSource and FlumeTopology is all put in
> comments... Not sure about it.
>
> There is no JIRA for a Flume connector... Does anyone know the status of
> Flume connector?
>
> On 02/03/2016 02:45 PM, Suneel Marthi wrote:
> > Most of the flume code is commented out IIRC?
> >
> > Sent from my iPhone
> >
> >> On Feb 3, 2016, at 8:24 AM, Matthias J. Sax wrote:
> >>
> >> Hi,
> >>
> >> I just observed that there are 7 flink-streaming-connectors available
> >> but only 5 are documented on the web page.
> >>
> >> Flume and Nifi are not documented. Did we miss to extend the
> >> documentation for both (which should have been part of the commit of the
> >> code) or was this left out on purpose?
> >>
> >> -Matthias
[jira] [Created] (FLINK-3326) Kafka Consumer doesn't retrieve messages after killing flink job which was using the same topic
Farouk Salem created FLINK-3326:
---
Summary: Kafka Consumer doesn't retrieve messages after killing flink job which was using the same topic
Key: FLINK-3326
URL: https://issues.apache.org/jira/browse/FLINK-3326
Project: Flink
Issue Type: Bug
Components: flink-contrib
Affects Versions: 0.10.1
Environment: Java
Reporter: Farouk Salem
Priority: Blocker

If a streaming job that retrieves data from a Kafka topic (starting from the smallest offset) is killed manually, another job (or the same job) started afterwards is not able to consume data from the same topic starting from the smallest offset.

I tried this more than 10 times, and it failed each time.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Connector Documentation missing
Sorry, I don't think it was purposely not documented :) It needs to be added. Could you file JIRAs and perhaps link them with the Flume/Nifi JIRAs? You could also reopen those.

On Wed, Feb 3, 2016 at 3:21 PM, Matthias J. Sax wrote:

> Thanks for clarification! What about the missing documentation?
>
> On 02/03/2016 03:18 PM, Maximilian Michels wrote:
> > There is currently only a FlumeSink. The FlumeSource is a dummy file
> > (copied from the AvroSource) and needs to be removed.
> >
> > On Wed, Feb 3, 2016 at 3:11 PM, Matthias J. Sax wrote:
> >
> >> FlumeSink is there. FlumeSource and FlumeTopology is all put in
> >> comments... Not sure about it.
> >>
> >> There is no JIRA for a Flume connector... Does anyone know the status of
> >> Flume connector?
> >>
> >> On 02/03/2016 02:45 PM, Suneel Marthi wrote:
> >>> Most of the flume code is commented out IIRC?
> >>>
> >>> Sent from my iPhone
> >>>
> >>>> On Feb 3, 2016, at 8:24 AM, Matthias J. Sax wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I just observed that there are 7 flink-streaming-connectors available
> >>>> but only 5 are documented on the web page.
> >>>>
> >>>> Flume and Nifi are not documented. Did we miss to extend the
> >>>> documentation for both (which should have been part of the commit of
> >>>> the code) or was this left out on purpose?
> >>>>
> >>>> -Matthias
[jira] [Created] (FLINK-3323) Nifi connector not documented
Matthias J. Sax created FLINK-3323:
---
Summary: Nifi connector not documented
Key: FLINK-3323
URL: https://issues.apache.org/jira/browse/FLINK-3323
Project: Flink
Issue Type: Improvement
Components: Documentation
Reporter: Matthias J. Sax
Priority: Minor
Re: Connector Documentation missing
FlumeSink is there. FlumeSource and FlumeTopology is all put in comments... Not sure about it.

There is no JIRA for a Flume connector... Does anyone know the status of Flume connector?

On 02/03/2016 02:45 PM, Suneel Marthi wrote:
> Most of the flume code is commented out IIRC?
>
> Sent from my iPhone
>
>> On Feb 3, 2016, at 8:24 AM, Matthias J. Sax wrote:
>>
>> Hi,
>>
>> I just observed that there are 7 flink-streaming-connectors available
>> but only 5 are documented on the web page.
>>
>> Flume and Nifi are not documented. Did we miss to extend the
>> documentation for both (which should have been part of the commit of the
>> code) or was this left out on purpose?
>>
>> -Matthias
[jira] [Created] (FLINK-3325) Unclosed InputStream in CEPPatternOperator#restoreState()
Ted Yu created FLINK-3325:
---
Summary: Unclosed InputStream in CEPPatternOperator#restoreState()
Key: FLINK-3325
URL: https://issues.apache.org/jira/browse/FLINK-3325
Project: Flink
Issue Type: Bug
Reporter: Ted Yu
Priority: Minor

Here is related code:
{code}
final InputStream is = stream.getState(getUserCodeClassloader());
final ObjectInputStream ois = new ObjectInputStream(is);
final DataInputViewStreamWrapper div = new DataInputViewStreamWrapper(is);

nfa = (NFA)ois.readObject();
{code}
Neither is nor ois is closed upon return from the method.
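One common way to make sure both streams are closed is try-with-resources. The sketch below is not the Flink operator: it replaces `stream.getState(...)` with an in-memory stream, and the names `RestoreStateSketch`, `TrackingStream`, `serialize`, and `restore` are hypothetical stand-ins so the example can run on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class RestoreStateSketch {
    // In-memory stand-in for the state stream; records whether close() was called.
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed;
        TrackingStream(byte[] bytes) { super(bytes); }
        @Override public void close() { closed = true; }
    }

    // Helper that produces serialized bytes to restore from.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize state", e);
        }
        return bos.toByteArray();
    }

    // Both resources are closed on every exit path, in reverse order of declaration.
    // Declaring the raw stream first means it is closed even if the
    // ObjectInputStream constructor throws.
    static Object restore(InputStream source) {
        try (InputStream is = source;
             ObjectInputStream ois = new ObjectInputStream(is)) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not restore state", e);
        }
    }
}
```

Whether the real method may close the state stream itself, or whether the caller owns it, is not stated in the issue, so ownership is an assumption of this sketch.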
[jira] [Created] (FLINK-3329) Unclosed BackupEngine in AbstractRocksDBState#snapshot()
Ted Yu created FLINK-3329:
---
Summary: Unclosed BackupEngine in AbstractRocksDBState#snapshot()
Key: FLINK-3329
URL: https://issues.apache.org/jira/browse/FLINK-3329
Project: Flink
Issue Type: Bug
Reporter: Ted Yu

Here is related code:
{code}
BackupEngine backupEngine = BackupEngine.open(Env.getDefault(), new BackupableDBOptions(localBackupPath.getAbsolutePath()));

backupEngine.createNewBackup(db);
{code}
BackupEngine implements AutoCloseable. backupEngine should be closed upon return from the method.
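Because the issue states that BackupEngine implements AutoCloseable, the engine can be scoped with try-with-resources so that it is closed even when the backup call throws. The sketch below deliberately avoids the RocksDB API: `FakeBackupEngine` is a hypothetical stand-in, and `BackupCloseSketch` and `snapshot` are illustrative names only.

```java
class BackupCloseSketch {
    // Hypothetical stand-in for an AutoCloseable backup engine; not the RocksDB class.
    static final class FakeBackupEngine implements AutoCloseable {
        boolean closed;

        void createNewBackup(boolean fail) {
            if (fail) {
                throw new IllegalStateException("backup failed");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Runs one backup and returns the engine so callers can inspect its state.
    static FakeBackupEngine snapshot(boolean fail) {
        FakeBackupEngine engine = new FakeBackupEngine();
        try (FakeBackupEngine scoped = engine) {
            scoped.createNewBackup(fail);
        } catch (IllegalStateException e) {
            // Swallowed only so this demo can return the engine after a failure.
        }
        return engine;
    }
}
```

In both the success and the failure case, `close()` has run by the time `snapshot` returns.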
[jira] [Created] (FLINK-3330) Add SparseVector support to BLAS library in FlinkML
Chiwan Park created FLINK-3330:
---
Summary: Add SparseVector support to BLAS library in FlinkML
Key: FLINK-3330
URL: https://issues.apache.org/jira/browse/FLINK-3330
Project: Flink
Issue Type: Improvement
Components: Machine Learning Library
Affects Versions: 1.0.0
Reporter: Chiwan Park
Assignee: Chiwan Park

A user reported a problem when using the {{GradientDescent}} algorithm with {{SparseVector}}. (http://mail-archives.apache.org/mod_mbox/flink-user/201602.mbox/%3CCAMJxVsiNRy_B349tuRpC%2BY%2BfyW7j2SHcyVfhqnz3BGOwEHXHpg%40mail.gmail.com%3E)

It seems that {{BLAS.axpy}} lacks SparseVector support.
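For reference, axpy computes y := a*x + y. When x is sparse, only its stored entries need to be visited. The following is a plain-Java sketch of that idea, not FlinkML's implementation or signature: it assumes the sparse vector is given as parallel arrays of indices and values, and the names `SparseAxpySketch` and `axpy` are hypothetical.

```java
class SparseAxpySketch {
    // Computes y := a * x + y in place, where x is sparse and described by
    // parallel arrays: xValues[k] is the entry of x at position xIndices[k].
    static void axpy(double a, int[] xIndices, double[] xValues, double[] y) {
        if (xIndices.length != xValues.length) {
            throw new IllegalArgumentException("indices and values must have the same length");
        }
        for (int k = 0; k < xIndices.length; k++) {
            // Entries of x that are not stored are zero and leave y unchanged.
            y[xIndices[k]] += a * xValues[k];
        }
    }
}
```

The loop touches only the stored entries of x, so the cost is proportional to the number of non-zeros rather than to the length of y.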
Connector Documentation missing
Hi,

I just observed that there are 7 flink-streaming-connectors available but only 5 are documented on the web page.

Flume and Nifi are not documented. Did we miss to extend the documentation for both (which should have been part of the commit of the code) or was this left out on purpose?

-Matthias
[jira] [Created] (FLINK-3327) Attach the ExecutionConfig to the JobGraph and make it accessible to the AbstractInvocable.
Kostas created FLINK-3327:
---
Summary: Attach the ExecutionConfig to the JobGraph and make it accessible to the AbstractInvocable.
Key: FLINK-3327
URL: https://issues.apache.org/jira/browse/FLINK-3327
Project: Flink
Issue Type: Sub-task
Reporter: Kostas
Assignee: Kostas
Re: [DISCUSS] Release 0.10.2
> On 02 Feb 2016, at 17:30, Robert Metzger wrote:
>
> Thank you for taking care of this. +1
>
> I'd like to include this in to the 0.10.2 release:
> https://github.com/apache/flink/pull/1576

OK (merged to master and release-10)
[jira] [Created] (FLINK-3328) Incorrectly shaded dependencies in flink-runtime
Stephan Ewen created FLINK-3328:
---
Summary: Incorrectly shaded dependencies in flink-runtime
Key: FLINK-3328
URL: https://issues.apache.org/jira/browse/FLINK-3328
Project: Flink
Issue Type: Bug
Components: Build System
Affects Versions: 1.0.0
Reporter: Stephan Ewen
Priority: Blocker
Fix For: 1.0.0

There are apparently some dependencies shaded into the {{flink-runtime}} fat jar that are not relocated. (The flink-runtime jar is now 70 MB.)

From the output of the shading in flink-dist, it looks as if this concerns at least
- Zookeeper
- slf4j
- jline
- netty (3.x)

Possibly more.

{code}
[WARNING] zookeeper-3.4.6.jar, flink-runtime_2.10-1.0-SNAPSHOT.jar define 440 overlapping classes:
[WARNING]   - org.apache.zookeeper.server.NettyServerCnxnFactory
[WARNING]   - org.apache.jute.compiler.JFile
[WARNING]   - org.apache.zookeeper.server.SessionTracker$Session
[WARNING]   - org.apache.zookeeper.server.quorum.AuthFastLeaderElection$1
[WARNING]   - org.apache.jute.compiler.JLong
[WARNING]   - org.apache.zookeeper.client.ZooKeeperSaslClient$SaslState
[WARNING]   - org.apache.zookeeper.server.auth.KerberosName$Rule
[WARNING]   - org.apache.jute.CsvOutputArchive
[WARNING]   - org.apache.zookeeper.server.quorum.QuorumPeer
[WARNING]   - org.apache.zookeeper.ZooKeeper$DataWatchRegistration
[WARNING]   - 430 more...
[WARNING] slf4j-api-1.7.7.jar, flink-runtime_2.10-1.0-SNAPSHOT.jar define 24 overlapping classes:
[WARNING]   - org.slf4j.spi.MarkerFactoryBinder
[WARNING]   - org.slf4j.helpers.SubstituteLogger
[WARNING]   - org.slf4j.helpers.BasicMarker
[WARNING]   - org.slf4j.helpers.Util
[WARNING]   - org.slf4j.LoggerFactory
[WARNING]   - org.slf4j.Marker
[WARNING]   - org.slf4j.helpers.NamedLoggerBase
[WARNING]   - org.slf4j.Logger
[WARNING]   - org.slf4j.spi.LocationAwareLogger
[WARNING]   - org.slf4j.ILoggerFactory
[WARNING]   - 14 more...
[WARNING] jansi-1.4.jar, jline-2.10.4.jar define 23 overlapping classes:
[WARNING]   - org.fusesource.jansi.Ansi$Erase
[WARNING]   - org.fusesource.jansi.Ansi
[WARNING]   - org.fusesource.jansi.AnsiOutputStream
[WARNING]   - org.fusesource.jansi.internal.CLibrary
[WARNING]   - org.fusesource.jansi.Ansi$2
[WARNING]   - org.fusesource.jansi.WindowsAnsiOutputStream
[WARNING]   - org.fusesource.jansi.AnsiRenderer$Code
[WARNING]   - org.fusesource.jansi.AnsiConsole
[WARNING]   - org.fusesource.jansi.Ansi$Attribute
[WARNING]   - org.fusesource.jansi.internal.Kernel32
[WARNING]   - 13 more...
[WARNING] commons-beanutils-core-1.8.0.jar, commons-collections-3.2.2.jar, commons-beanutils-1.7.0.jar define 10 overlapping classes:
[WARNING]   - org.apache.commons.collections.FastHashMap$EntrySet
[WARNING]   - org.apache.commons.collections.ArrayStack
[WARNING]   - org.apache.commons.collections.FastHashMap$1
[WARNING]   - org.apache.commons.collections.FastHashMap$KeySet
[WARNING]   - org.apache.commons.collections.FastHashMap$CollectionView
[WARNING]   - org.apache.commons.collections.BufferUnderflowException
[WARNING]   - org.apache.commons.collections.Buffer
[WARNING]   - org.apache.commons.collections.FastHashMap$CollectionView$CollectionViewIterator
[WARNING]   - org.apache.commons.collections.FastHashMap$Values
[WARNING]   - org.apache.commons.collections.FastHashMap
[WARNING] flink-streaming-scala_2.10-1.0-SNAPSHOT.jar, flink-core-1.0-SNAPSHOT.jar, flink-runtime_2.10-1.0-SNAPSHOT.jar, flink-java-1.0-SNAPSHOT.jar, flink-streaming-java_2.10-1.0-SNAPSHOT.jar, flink-scala_2.10-1.0-SNAPSHOT.jar, flink-clients_2.10-1.0-SNAPSHOT.jar, flink-optimizer_2.10-1.0-SNAPSHOT.jar, flink-runtime-web_2.10-1.0-SNAPSHOT.jar define 1690 overlapping classes:
[WARNING]   - org.apache.flink.shaded.com.google.common.collect.LinkedListMultimap
[WARNING]   - org.apache.flink.shaded.com.google.common.io.ByteSource$AsCharSource
[WARNING]   - org.apache.flink.shaded.com.google.common.escape.Platform
[WARNING]   - org.apache.flink.shaded.com.google.common.util.concurrent.Futures$ImmediateFailedCheckedFuture
[WARNING]   - org.apache.flink.shaded.com.google.common.primitives.SignedBytes$LexicographicalComparator
[WARNING]   - org.apache.flink.shaded.com.google.common.cache.LocalCache$WriteQueue$2
[WARNING]   - org.apache.flink.shaded.com.google.common.escape.Escaper$1
[WARNING]   - org.apache.flink.shaded.com.google.common.collect.MultimapBuilder$SetMultimapBuilder
[WARNING]   - org.apache.flink.shaded.com.google.common.collect.Ordering$ArbitraryOrdering
[WARNING]   - org.apache.flink.shaded.com.google.common.collect.Synchronized$SynchronizedAsMapEntries$1
[WARNING]   - 1680 more...
[WARNING] flink-scala_2.10-1.0-SNAPSHOT.jar, flink-java-1.0-SNAPSHOT.jar, flink-streaming-scala_2.10-1.0-SNAPSHOT.jar, flink-runtime_2.10-1.0-SNAPSHOT.jar define 25 overlapping classes:
[WARNING]   - org.apache.flink.shaded.org.objectweb.asm.Context