> On March 28, 2014, 6:54 p.m., Rohini Palaniswamy wrote:
> > We should call it dataInPartition as it only works if there is one 
> > partition. 
> > 
> > I like the idea of the generalization, but I can't seem to be able to think 
> > of another use case where the current generalization will actually help. 
> > Can you list out some usecases so that we are sure that this generalization 
> > is actually useful and can be applied elsewhere?
> 
> Satish Mittal wrote:
>     Hi Rohini, the reason I named it dataInPartitions is so that it is 
> consistent with dataOutPartitions. Please let me know if you still think we 
> should rename it.
>     
>     Generalizing the separators and equals string to create partition value 
> will prevent unnecessarily introduce new EL functions just to create a 
> partition value which only differs in a particular aspect, e.g. a) suppose 
> the desired partition value for a particular feature in Pig/Hive/New  action 
> type is ';'/'AND/OR' separated instead of current ','. Or b) suppose the 
> expected equality string is '==' instead of current '='. Or c) It could be a 
> combination of a) and b).
>     We may not be having such a scenario immediately, but it is not hard to 
> imagine one in future, given the fact that partition value/filter formats are 
> not standardized even today across Pig/Hive. In such a case, we don't want to 
> have an overloaded dataInPartition EL for each new format and it is good to 
> be future-proof.
> 
> Rohini Palaniswamy wrote:
>     Since you are throwing error on more than one partition, it has to be 
> dataInPartition. If another EL has to be added to get all partitions, this 
> should not clash with it.
>     
>     My question is, is it really future proof and has it considered all 
> possibilities? Current EL can only address (col1="value1",col2="value2"), 
> (col1="value1" AND col2="value2") and (col1=="value1" AND col2=="value2"). It 
> is not easy to change the EL Functions or overload it (according to Shwetha 
> in OOZIE-1757) and a new one has to be created if this one does not work. For 
> eg: What about ' vs "". What if the argument expected was something like 
> ("minute=00,month=03,year=2014,hour=09") which is used by hcat (OOZIE-1757) 
> for output currently? This function cannot be applied there. Even if future 
> possibilities cannot be thought, look at all existing usages in pig and hive 
> and enhance this function.

Actually you can keep it dataInPartitions, by adding a separator for different 
partitions/instances so that the same EL will handle one or more 
partition/instance.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19449/#review38930
-----------------------------------------------------------


On March 28, 2014, 12:38 p.m., Satish Mittal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19449/
> -----------------------------------------------------------
> 
> (Updated March 28, 2014, 12:38 p.m.)
> 
> 
> Review request for oozie and Ryota Egashira.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> Currently oozie provides coord EL functions to get partition filter string 
> (where key-values are separated by AND) for various action types - 
> java/pig/hive. However this doesn't work in other cases, e.g. hive action 
> that performs export/import of hive partition. In that case, the partition 
> value is expected to be in a different format: 
> (col1="value1",col2="value2"...). 
> 
> If we use any existing EL function like coord:dataInPartitionFilter, the 
> value of partition looks like:
> (minute='00' AND month='03' AND year='2014' AND hour='09' AND day='11')
> In this case, Hive export fails with error:
> FAILED: ParseException line 1:51 mismatched input 'AND' expecting ) near 
> ''00'' in export statement
> 
> We need to add a new EL function that returns partition value in the above 
> format.
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/coord/HCatELFunctions.java e5f0146 
>   core/src/main/resources/oozie-default.xml 34362aa 
>   core/src/test/java/org/apache/oozie/coord/TestHCatELFunctions.java f46f1ec 
>   docs/src/site/twiki/CoordinatorFunctionalSpec.twiki 44a3d54 
>   sharelib/hcatalog/src/main/java/org/apache/oozie/util/HCatURI.java d797f9b 
> 
> Diff: https://reviews.apache.org/r/19449/diff/
> 
> 
> Testing
> -------
> 
> - Manually verified the new EL function on my local setup.
> - Added new test cases for the same.
> 
> 
> Thanks,
> 
> Satish Mittal
> 
>

Reply via email to