To be fair to the good doctor, I was a whiny patient :)

Yes, I feel your pain. I am working on a boolean patch and hope to finish it this weekend: https://issues.apache.org/jira/browse/PIG-1429
You will be able to do all sane things with booleans as a first-class data type.

We just happen to have booleans in our data because they exist as a type in Voldemort, and we use the VoldemortStorage format for data on HDFS. We have UDFs to evaluate them, so we can use them that way via an isTrue(foo:boolean) UDF. Actually building them into Pig as a data type seemed like a good first step toward adding DateTime support, which is what I really want to do. A lot of the work is common ground.

The most common use of Pig is log processing, so Pig should have datetimes. So should Hive. The DateTime UDFs aren't enough.

Russ

On Thu, Jun 24, 2010 at 1:26 PM, hc busy <hc.b...@gmail.com> wrote:

> Russell, fire that neurologist who didn't care about what you had to think
> about your own problems!! ;-)
>
> But honestly though, I use booleans in my pig scripts too. The trouble is
> that you can store a boolean into a row (awkwardly by returning from UDF:
> "t=foreach u generate IsEmpty(u.bag) as bool_var;" ), but you can't
> actually use a boolean column.
>
> T = filter t by bool_var;
>
> There are oppositions to writing scripts using this, the argument is that
> wtf bother with this boolean column? The more natural thing in Pig would be
> to split on that boolean column and then use the two aliases separately.
> Personally, to me, having a boolean column makes it easier to think about...
>
> Anyways, where have you (guys all) used booleans in pig?
>
>
> On Thu, Jun 24, 2010 at 11:51 AM, Russell Jurney
> <russell.jur...@gmail.com> wrote:
>
> > Wrote a... thing about Pig at LinkedIn that might be useful to some:
> >
> > http://sna-projects.com/blog/2010/06/when-pigs-fly-apache-pig-open-source-and-understanding-systems/
> >
> > Russ