> On Feb. 5, 2013, 3:43 a.m., Mark Grover wrote:
> > hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 192
> > <https://reviews.apache.org/r/9276/diff/1/?file=254957#file254957line192>
> >
> >     This seems like a limited case of pattern matching. Swarnim, is there
> >     any way we can support generic regex matching instead?
>
> Swarnim Kulkarni wrote:
>     Mark, in this case I specifically wanted to allow only strings that end
>     with exactly the character "*", and using String#endsWith seemed simpler
>     and more readable than a regex. Do you still want me to replace this
>     with regex matching?
>
> Brock Noland wrote:
>     I think the issue is that this would make it difficult to implement
>     enhanced pattern matching later. Implementing it now, you'd only need to
>     specify:
>
>     col.*
>
>     in the table configuration. Now the issue would be detecting whether the
>     particular column was a regex pattern. Because #, comma, and : are used
>     as separators, those characters could not be used in a pattern.
>
> Swarnim Kulkarni wrote:
>     Thanks Brock. Makes sense. To be sure I am understanding you right, the
>     change now would be just to replace the "parts[1].endsWith(*)" with
>     something more regexy that would still imply that the string ends with
>     "*". Correct?
>
> Mark Grover wrote:
>     I think that should do it.
>
>     Personally, I think having limited regex matching is just going to
>     confuse people, so if you could implement (and test) full Java-style
>     regex matching (like we do for RegexSerDe, for example), that would be
>     fantastic. Of course, let me know if you have questions!
>
>     Thanks for doing this, BTW!
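For reference, the two approaches discussed in the thread above can be compared in a minimal sketch. This is not the code under review: the class QualifierPatternSketch and its method names are hypothetical, chosen only for illustration, and the column qualifiers are plain strings. The only library calls used are String#endsWith, String#startsWith and java.util.regex.Pattern#matches.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the HBaseSerDe code under review.
class QualifierPatternSketch {

    // Limited form: the spec is a prefix wildcard only if it ends with a literal "*".
    static boolean isPrefixWildcard(String qualifierSpec) {
        return qualifierSpec.endsWith("*");
    }

    // Prefix matching: drop the trailing "*" and compare with startsWith.
    // Callers are expected to check isPrefixWildcard first.
    static boolean matchesPrefix(String qualifierSpec, String qualifier) {
        String prefix = qualifierSpec.substring(0, qualifierSpec.length() - 1);
        return qualifier.startsWith(prefix);
    }

    // General form: treat the spec as a java.util.regex pattern that must
    // match the whole qualifier.
    static boolean matchesRegex(String qualifierSpec, String qualifier) {
        return Pattern.matches(qualifierSpec, qualifier);
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix("col*", "col1"));  // true
        System.out.println(matchesRegex("col.*", "col1"));  // true
        // As a regex, "col*" means "co" followed by zero or more "l",
        // so it does not match "col1".
        System.out.println(matchesRegex("col*", "col1"));   // false
    }
}
```

The last call shows why the two notations should not be mixed: "col*" read as a wildcard and "col*" read as a regex select different columns, so a spec has to be interpreted consistently one way or the other.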
Thanks for the suggestions. I incorporated them and updated the review. If
you get a chance, please let me know whether the changes look any better.


- Swarnim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review16080
-----------------------------------------------------------


On Feb. 9, 2013, 9:56 p.m., Swarnim Kulkarni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9276/
> -----------------------------------------------------------
>
> (Updated Feb. 9, 2013, 9:56 p.m.)
>
>
> Review request for hive.
>
>
> Description
> -------
>
> Added support for pulling HBase columns just by providing a prefix and a
> wildcard. So a query could now look something like this:
>
>     CREATE EXTERNAL TABLE hive_hbase_test
>     ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
>     STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*")
>     TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
>
> This would pull in all columns under column family "fam1" whose names start
> with "col". This gives a little more flexibility than the pull-all-columns
> format.
>
>
> This addresses bug HIVE-3725.
>     https://issues.apache.org/jira/browse/HIVE-3725
>
>
> Diffs
> -----
>
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java a8ba9d9
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52
>   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java e821282
>
> Diff: https://reviews.apache.org/r/9276/diff/
>
>
> Testing
> -------
>
> Added unit tests to demonstrate the new functionality. Also made sure that
> all existing unit tests passed.
>
>
> Thanks,
>
> Swarnim Kulkarni
>
>
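To make the behaviour in the description concrete, here is a minimal sketch of how a mapping entry such as "fam1:col*" could select column qualifiers by prefix. It is an illustration under stated assumptions, not the patch itself: PrefixMappingSketch and selectQualifiers are hypothetical names, qualifiers are modelled as plain strings rather than HBase byte arrays, and an entry with an empty qualifier ("fam1:") is assumed to be the pull-all-columns form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and string-based qualifiers are assumptions.
class PrefixMappingSketch {

    // Returns the qualifiers of `family` selected by one mapping entry,
    // for example "fam1:col*", "fam1:col1" or "fam1:".
    static List<String> selectQualifiers(String mappingEntry, String family,
                                         List<String> qualifiers) {
        List<String> selected = new ArrayList<>();
        // Split into family and qualifier spec; limit 2 keeps an empty spec.
        String[] parts = mappingEntry.split(":", 2);
        if (parts.length != 2 || !parts[0].equals(family)) {
            return selected; // entry refers to another family (or to :key)
        }
        String spec = parts[1];
        for (String qualifier : qualifiers) {
            boolean match;
            if (spec.isEmpty()) {
                match = true; // assumed pull-all-columns form
            } else if (spec.endsWith("*")) {
                // prefix wildcard: compare against the text before "*"
                match = qualifier.startsWith(spec.substring(0, spec.length() - 1));
            } else {
                match = qualifier.equals(spec); // exact column name
            }
            if (match) {
                selected.add(qualifier);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> qualifiers = Arrays.asList("col1", "col2", "other");
        System.out.println(selectQualifiers("fam1:col*", "fam1", qualifiers)); // [col1, col2]
        System.out.println(selectQualifiers("fam1:", "fam1", qualifiers));     // [col1, col2, other]
    }
}
```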