[ 
https://issues.apache.org/jira/browse/FLINK-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703841#comment-14703841
 ] 

Chesnay Schepler edited comment on FLINK-2541 at 8/19/15 10:20 PM:
-------------------------------------------------------------------

I did some digging and found the following:

Within createComparator() there are effectively 2 cases that are considered:
# we are looking at a tuple that could contain the key
# we are looking at an AtomicType that is the key

The problem appears to be that when the key is a byte[], the second case 
doesn't apply because PrimitiveArrayTypeInfo does not implement AtomicType. 
This causes the field to be skipped, so we end up with 0 comparison fields 
set and the exception is thrown.
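
To illustrate the control flow described above, here is a minimal, self-contained sketch. It is not Flink's actual code: every type and method in it (TypeInfo, AtomicType, PrimitiveArrayType, TupleType, countComparisonFields) is a hypothetical stand-in, and only the exception message is taken from the stack trace in the issue.

```java
import java.util.Arrays;
import java.util.List;

public class ComparatorSketch {
    // Hypothetical stand-ins for Flink's type information classes.
    interface TypeInfo {}
    interface AtomicType extends TypeInfo {}
    static final class IntType implements AtomicType {}
    // Assumption mirrored from the comment: the array type is NOT an AtomicType.
    static final class PrimitiveArrayType implements TypeInfo {}
    static final class TupleType implements TypeInfo {
        final List<TypeInfo> fields;
        TupleType(TypeInfo... fields) { this.fields = Arrays.asList(fields); }
    }

    // Follows the key path (e.g. "f0.f0" -> {0, 0}) and counts usable key fields.
    static int countComparisonFields(TupleType type, int[] keyPath, int depth) {
        TypeInfo field = type.fields.get(keyPath[depth]);
        boolean isLast = depth == keyPath.length - 1;
        if (field instanceof TupleType && !isLast) {
            // Case 1: a tuple that could contain the key, so descend into it.
            return countComparisonFields((TupleType) field, keyPath, depth + 1);
        } else if (field instanceof AtomicType && isLast) {
            // Case 2: an atomic type that is the key.
            return 1;
        }
        // Neither case applies: the field is silently skipped.
        return 0;
    }

    static String createComparator(TupleType type, int... keyPath) {
        int fields = countComparisonFields(type, keyPath, 0);
        if (fields == 0) {
            throw new IllegalArgumentException("Tuple comparator creation has a bug");
        }
        return "comparator over " + fields + " field(s)";
    }

    public static void main(String[] args) {
        // Tuple2<Tuple1<int>, byte[]> keyed on f0.f0: the key is atomic.
        TupleType intKeyed = new TupleType(
                new TupleType(new IntType()), new PrimitiveArrayType());
        System.out.println(createComparator(intKeyed, 0, 0));

        // Tuple2<Tuple1<byte[]>, byte[]> keyed on f0.f0: the key is an array.
        TupleType arrayKeyed = new TupleType(
                new TupleType(new PrimitiveArrayType()), new PrimitiveArrayType());
        try {
            createComparator(arrayKeyed, 0, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the array-keyed tuple ends up with zero comparison fields, which is the condition that triggers the exception.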

So it appears that you can't group on a byte[], which blocks FLINK-2501. 
We would probably need a GenericArrayComparator as well...
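
For reference, a rough sketch of the kind of comparison a byte[] key would need. This is plain Java and not a Flink TypeComparator; the class and method names are made up for illustration, and lexicographic ordering over signed bytes is just one possible choice.

```java
import java.util.Arrays;

public class ByteArrayKeySketch {
    // Hypothetical helper: orders byte arrays element by element (signed bytes),
    // with a shorter array sorting before a longer one that shares its prefix.
    static int compareLexicographically(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int cmp = Byte.compare(a[i], b[i]);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] first = {1, 2};
        byte[] second = {1, 2};

        // Arrays inherit identity-based equals() from Object, so two arrays with
        // the same content are not equal unless compared by content.
        System.out.println(first.equals(second));        // false
        System.out.println(Arrays.equals(first, second)); // true
        System.out.println(compareLexicographically(first, second) == 0); // true
    }
}
```

Grouping needs content-based equality and hashing (compare Arrays.equals and Arrays.hashCode), while sorting needs an ordering such as the one above.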




> TypeComparator creation fails for T2<T1<byte[]>, byte[]>
> --------------------------------------------------------
>
>                 Key: FLINK-2541
>                 URL: https://issues.apache.org/jira/browse/FLINK-2541
>             Project: Flink
>          Issue Type: Bug
>          Components: Java API
>            Reporter: Chesnay Schepler
>
> When running the following job as a JavaProgramTest:
> {code}
> ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> DataSet<Tuple2<Tuple1<byte[]>, byte[]>> data = env.fromElements(
>       new Tuple2<Tuple1<byte[]>, byte[]>(
>               new Tuple1<byte[]>(new byte[]{1, 2}), 
>               new byte[]{1, 2, 3}),
>       new Tuple2<Tuple1<byte[]>, byte[]>(
>               new Tuple1<byte[]>(new byte[]{1, 2}), 
>               new byte[]{1, 2, 3}));
> data.groupBy("f0.f0")
>     .reduceGroup(new DummyReduce<Tuple2<Tuple1<byte[]>, byte[]>>())
>     .print();
> {code}
> with DummyReduce defined as
> {code}
> public static class DummyReduce<IN> implements GroupReduceFunction<IN, IN> {
>     @Override
>     public void reduce(Iterable<IN> values, Collector<IN> out) throws Exception {
>         for (IN value : values) {
>             out.collect(value);
>         }
>     }
> }
> {code}
> I encountered the following exception:
> Tuple comparator creation has a bug
> java.lang.IllegalArgumentException: Tuple comparator creation has a bug
>       at org.apache.flink.api.java.typeutils.TupleTypeInfo.getNewComparator(TupleTypeInfo.java:131)
>       at org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:133)
>       at org.apache.flink.api.common.typeutils.CompositeType.createComparator(CompositeType.java:122)
>       at org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.getTypeComparator(GroupReduceOperatorBase.java:155)
>       at org.apache.flink.api.common.operators.base.GroupReduceOperatorBase.executeOnCollections(GroupReduceOperatorBase.java:184)
>       at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:236)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
>       at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:215)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:143)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
>       at org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:176)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:152)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:125)
>       at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:109)
>       at org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:33)
>       at org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:35)
>       at org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:30)
>       at org.apache.flink.api.java.DataSet.collect(DataSet.java:408)
>       at org.apache.flink.api.java.DataSet.print(DataSet.java:1349)
>       at org.apache.flink.languagebinding.api.java.python.AbstractPythonTest.testProgram(AbstractPythonTest.java:42)
>       at org.apache.flink.test.util.JavaProgramTestBase.testJobCollectionExecution(JavaProgramTestBase.java:226)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
>       at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
>       at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
>       at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)



