Hello, I'm rather new to Hive and have been playing with it for the last couple of weeks to see whether it is appropriate for a particular project where I work. My main question is how to maintain data integrity in the tables so that we don't accidentally load duplicate data. Normally we rely on indexes or unique keys to enforce this. Is there a general strategy for this in Hive?
As a second question: I haven't seen anything like it in the docs, but is there any equivalent to CASE, DECODE, or IF-THEN-ELSE allowed in a query? Thanks! -Shane P. Brady
