[ https://issues.apache.org/jira/browse/DATAFU-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106894#comment-16106894 ]
Eyal Allweil commented on DATAFU-83:
------------------------------------

Hi Kyle ([~ItsAUsernameRight?])

Your help is very welcome. I have two comments about the state of the contribution - I'll put them both here and in the review board for maximum visibility.

1. I think the output schema of this UDF is always boolean, not the schema of the first input field. I would make the outputSchema method identical to that in an existing Boolean UDF - for example, [Pig's ENDSWITH built-in function|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/builtin/ENDSWITH.java#L62]

2. As Matthew already wrote in the review board, adding a case to the unit test is a good idea - you can probably just duplicate something from [the existing test|https://github.com/apache/incubator-datafu/blob/master/datafu-pig/src/test/java/datafu/test/pig/util/InTests.java].

Thanks!

> InUDF does not validate that types are compatible
> -------------------------------------------------
>
>                 Key: DATAFU-83
>                 URL: https://issues.apache.org/jira/browse/DATAFU-83
>             Project: DataFu
>          Issue Type: Improvement
>            Reporter: Matthew Hayes
>            Priority: Minor
>         Attachments: DATAFU-83.patch, rb36702.patch
>
>
> See the example below. The input data is a long, but ints are provided to
> match against. Because it uses the Java equals to compare and these are
> different types, this will never match, which can lead to confusing results.
> I believe it should at least throw an error.
> {code}
> define I datafu.pig.util.InUDF();
>
> data = LOAD 'input' AS (B: bag {T: tuple(v:LONG)});
>
> data2 = FOREACH data {
>   C = FILTER B By I(v, 1,2,3);
>   GENERATE C;
> }
>
> describe data2;
>
> STORE data2 INTO 'output';
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
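
For readers unfamiliar with why the comparison in the issue description never matches, here is a minimal plain-Java sketch of the behaviour. It is not the InUDF source: the class and method names (EqualsMismatchSketch, matchesByEquals, matchesStrict) are hypothetical and exist only for illustration. The sketch assumes the value and candidates are non-null boxed values, as they would be after Pig loads a LONG field and the script passes int literals. The strict variant is just one possible way to "at least throw an error"; the actual patch may do this differently.

```java
// Hypothetical illustration class; not part of DataFu or Pig.
final class EqualsMismatchSketch {

    // Membership check based on Object.equals, as the issue describes.
    // Long.equals(Integer) is always false, even for the same numeric value.
    static boolean matchesByEquals(Object value, Object... candidates) {
        for (Object candidate : candidates) {
            if (value.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    // One possible stricter variant (an assumption, not the actual fix):
    // fail fast when a candidate's runtime class differs from the value's.
    static boolean matchesStrict(Object value, Object... candidates) {
        for (Object candidate : candidates) {
            if (candidate != null && !candidate.getClass().equals(value.getClass())) {
                throw new IllegalArgumentException(
                    "Incompatible types: " + value.getClass().getSimpleName()
                    + " vs " + candidate.getClass().getSimpleName());
            }
        }
        return matchesByEquals(value, candidates);
    }

    public static void main(String[] args) {
        // Long value against Integer candidates: silently never matches.
        System.out.println(matchesByEquals(1L, 1, 2, 3));
        // Long value against Long candidates: matches as expected.
        System.out.println(matchesByEquals(1L, 1L, 2L, 3L));
        // The strict variant surfaces the type mismatch instead of hiding it.
        try {
            matchesStrict(1L, 1, 2, 3);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the Pig script from the issue, this corresponds to writing the literals as longs (for example 1L, 2L, 3L) so that the boxed types line up with the LONG field.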