Yes, I think at least documenting the known issue would help. Thanks!

2014-07-31 17:09 GMT+02:00 Josh Wills <jwi...@cloudera.com>:

> Understood. Anything I can do to help? Docfix, at least?
>
>
> On Thu, Jul 31, 2014 at 1:08 AM, Mārtiņš Kalvāns <martins.kalv...@gmail.com>
> wrote:
>
> > It is avoidable almost always, problem is that in our company Crunch user
> > base is growing and many of them are "not so technical" to fast and
> > effectively catch problems like this and find workarounds. :(
> >
> > --
> > Mārtiņš
> >
> >
> > 2014-07-30 18:45 GMT+02:00 Josh Wills <jwi...@cloudera.com>:
> >
> > > My hypothesis is that we re-use null in joins to indicate the absence of a
> > > value, so if the value of an entry is null, we assume it's non-existent.
> > > I'm assuming there isn't an easy way to switch the Void out for a non-null
> > > but ignored value?
> > >
> > > J
> > >
> > >
> > > On Wed, Jul 30, 2014 at 9:35 AM, Mārtiņš Kalvāns <martins.kalv...@gmail.com>
> > > wrote:
> > >
> > > > Hi.
> > > >
> > > > I stumbled on weird behaviour (bug?) when joining PTable<?, Void> on left
> > > > side with any other PTable - resulting collection is empty.
> > > > Attached example code demonstrates unexpected behaviour.
> > > > Code in question is in org.apache.crunch.lib.join.InnerJoinFn line 59
> > > > where it checks for null reference on left dataset (same for other join fn
> > > > implementations).
> > > > Anyone can comment on this?
> > > >
> > > >
> > > > --
> > > > Mārtiņš Kalvāns
> > >
> > >
> > > --
> > > Director of Data Science
> > > Cloudera <http://www.cloudera.com>
> > > Twitter: @josh_wills <http://twitter.com/josh_wills>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
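
For anyone finding this thread later, here is a minimal plain-Java sketch of the hypothesis quoted above: if a join function uses null to mean "no left value", then a left side whose values are all null (as with Void) produces no output. This is not the actual InnerJoinFn code and it does not use the Crunch API; the class NullSentinelJoinSketch, the method innerJoin and the PRESENT placeholder are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullSentinelJoinSketch {

    // Hypothetical non-null placeholder to use instead of a Void (null) value.
    static final Boolean PRESENT = Boolean.TRUE;

    // Simplified stand-in for a join function that reads a null left value
    // as "no left record". Assumption: this mirrors the null check described
    // in the thread; it is NOT copied from Crunch.
    static <K, U, V> List<String> innerJoin(K key, List<U> left, List<V> right) {
        List<String> out = new ArrayList<>();
        for (U l : left) {
            if (l == null) {
                continue; // null is treated as absent, so nothing is emitted
            }
            for (V r : right) {
                out.add(key + "=(" + l + "," + r + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> right = Arrays.asList("a", "b");

        // A Void-typed value can only ever be null.
        List<Void> voidLeft = new ArrayList<>();
        voidLeft.add(null);
        System.out.println(innerJoin("k", voidLeft, right)); // prints []

        // Same key, but with a non-null placeholder on the left side.
        List<Boolean> placeholderLeft = Arrays.asList(PRESENT);
        System.out.println(innerJoin("k", placeholderLeft, right));
        // prints [k=(true,a), k=(true,b)]
    }
}
```

If the hypothesis holds, the workaround on the user side would be along the same lines: map the Void values to some non-null but ignored value before joining, and drop it again afterwards.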