> On Feb. 5, 2013, 3:43 a.m., Mark Grover wrote:
> > hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 
> > 192
> > <https://reviews.apache.org/r/9276/diff/1/?file=254957#file254957line192>
> >
> >     This seems like a limited case of pattern matching. Swarnim, any way we 
> > can support generic regex matching instead?
> 
> Swarnim Kulkarni wrote:
>     Mark, in this case I specifically wanted to allow only strings that end 
> with exactly the character "*", and using String#endsWith seemed simpler 
> and more readable than a regex. Do you still want me to replace this with 
> regex matching?
> 
> Brock Noland wrote:
>     I think the issue is that this would make it difficult to implement 
> enhanced pattern matching later. Implementing it now, you'd only need to 
> specify:
>     
>     col.*
>     
>     in the table configuration. Now the issue would be detecting if the 
> particular column was a regex pattern. Because #, comma, and : are used as 
> separators that would exclude those characters from being used.
> 
> Swarnim Kulkarni wrote:
>     Thanks Brock. Makes sense. To be sure I am understanding you right, the 
> change now would be just to replace the "parts[1].endsWith(*)" with something 
> more regexy that would still imply that the string ends with "*". Correct?
> 
> Mark Grover wrote:
>     I think that should do it.
>     
>     Personally, I think having limited regex matching is just going to 
> confuse people, so if you could implement (and test) full Java-style regex 
> matching (like we do for RegexSerDe, for example), that would be fantastic. 
> Of course, let me know if you have questions!
>     
>     Thanks for doing this, BTW!

Thanks for the suggestions. I incorporated them and updated the review. If you 
get a chance, please let me know whether the changes look any better.
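
For anyone following the thread, here is a minimal sketch of the two readings of the qualifier part of a mapping such as "fam1:col*" that were discussed above. It is not the code in the review: the class QualifierMatcher and its methods matchesPrefix and matchesRegex are hypothetical names used only for illustration, and the regex variant assumes plain java.util.regex semantics with a full match against the qualifier.

```java
import java.util.regex.Pattern;

// Hypothetical helper, NOT the code under review: it only contrasts the two
// interpretations of the qualifier spec discussed in this thread.
final class QualifierMatcher {

    private QualifierMatcher() {}

    // Prefix reading: "col*" means "any qualifier that starts with col".
    // A spec without a trailing '*' has to match the qualifier exactly.
    static boolean matchesPrefix(String spec, String qualifier) {
        if (spec.endsWith("*")) {
            String prefix = spec.substring(0, spec.length() - 1);
            return qualifier.startsWith(prefix);
        }
        return spec.equals(qualifier);
    }

    // Regex reading: the spec is a java.util.regex pattern that must match
    // the whole qualifier, so "col.*" plays the role of the prefix "col*".
    static boolean matchesRegex(String spec, String qualifier) {
        return Pattern.matches(spec, qualifier);
    }
}
```

Under the prefix reading, "col*" matches "col1" but not "co". Read as a regex, the same string "col*" means "co" followed by zero or more "l", so it matches "co" and "coll" but not "col1"; the regex that selects every qualifier starting with "col" is "col.*". That difference is why the mapping syntax has to make clear which reading applies.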


- Swarnim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review16080
-----------------------------------------------------------


On Feb. 9, 2013, 9:56 p.m., Swarnim Kulkarni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9276/
> -----------------------------------------------------------
> 
> (Updated Feb. 9, 2013, 9:56 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Description
> -------
> 
> Added support for pulling HBase columns just by providing a prefix and a 
> wildcard. So a query could now look something like this:
> 
> CREATE EXTERNAL TABLE hive_hbase_test
> ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") 
> TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
> 
> This would pull in all columns under column family "fam1" which start with 
> "col". This gives a little more flexibility than the pull-all-columns format.
> 
> 
> This addresses bug HIVE-3725.
>     https://issues.apache.org/jira/browse/HIVE-3725
> 
> 
> Diffs
> -----
> 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 
> a8ba9d9 
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
> d35bb52 
>   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 
> e821282 
> 
> Diff: https://reviews.apache.org/r/9276/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests to demonstrate the new functionality. Also made sure that 
> all existing unit tests passed.
> 
> 
> Thanks,
> 
> Swarnim Kulkarni
> 
>
