My hypothesis is that we re-use null in joins to indicate the absence of a value, so if the value of an entry is null (which is always the case for a Void), we assume the entry doesn't exist and drop it. I'm assuming there isn't an easy way for you to swap the Void out for a non-null value that simply gets ignored?
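
To make the hypothesis concrete, here is a toy sketch in plain Java. It is not the Crunch code: `NullJoinSketch` and `innerJoin` are made-up names, and the join is reduced to a nested loop over the values grouped under a single key. It only assumes what is described in this thread, namely that a null on either side is read as "no value".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a join function; NOT org.apache.crunch.lib.join.InnerJoinFn.
final class NullJoinSketch {

    // Joins the left and right values that share one key.
    // Assumption: null means "no value", so null entries are skipped.
    static <U, V> List<String> innerJoin(List<U> left, List<V> right) {
        List<String> out = new ArrayList<>();
        for (U l : left) {
            if (l == null) {
                continue; // a Void value is always null, so it always lands here
            }
            for (V r : right) {
                if (r == null) {
                    continue;
                }
                out.add(l + "," + r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> right = Arrays.asList("a", "b");

        // Left side typed as Void: every value is null, so nothing is emitted.
        List<Void> voidLeft = Arrays.asList((Void) null, (Void) null);
        System.out.println(innerJoin(voidLeft, right).size()); // prints 0

        // Same shape with a non-null placeholder: 2 x 2 pairs are emitted.
        List<Boolean> placeholderLeft = Arrays.asList(Boolean.TRUE, Boolean.TRUE);
        System.out.println(innerJoin(placeholderLeft, right).size()); // prints 4
    }
}
```

If that is what is happening, mapping the Void values to some non-null placeholder (a Boolean here, purely as an example) before the join and discarding it afterwards would be one possible workaround.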

J

On Wed, Jul 30, 2014 at 9:35 AM, Mārtiņš Kalvāns <martins.kalv...@gmail.com> wrote:

> Hi.
>
> I stumbled on weird behaviour (bug?) when joining PTable<?, Void> on left
> side with any other PTable - resulting collection is empty.
> Attached example code demonstrates unexpected behaviour.
> Code in question is in org.apache.crunch.lib.join.InnerJoinFn line 59
> where it checks for null reference on left dataset (same for other join fn
> implementations).
> Anyone can comment on this?
>
>
> --
> Mārtiņš Kalvāns
>

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>