pig-user  

Is SUBSTRING's behavior desirable?

Dmitriy Ryaboy
Fri, 22 Jan 2010 10:20:56 -0800

Currently, Pig's SUBSTRING (in piggybank) takes the parameters (string,
startIndex, endIndex).

If endIndex is past the end of the string, an error is logged and the
string is dropped (a null is returned). This is consistent with Java's
String.substring().  It seems to me that while this makes sense in
Java, this is not desirable in Pig where you can't catch an exception,
do runtime length checking, etc. I would prefer to have SUBSTRING
avoid the Java exception by calling str.substring(startIndex,
min(str.length(), endIndex)). (The clamp is to str.length() rather than
str.length()-1 because Java's endIndex is exclusive.)
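
To make the proposal concrete, here is a rough sketch of the clamping
logic in plain Java. It is not the piggybank code: the class and method
names are made up for illustration, and the Pig UDF plumbing (EvalFunc,
Tuple handling, logging) is left out. Because String.substring treats
endIndex as exclusive, the sketch clamps to str.length(). Returning null
for a null input or a start index that is still out of range is my
assumption about what we would want.

```java
// Hypothetical helper, not the actual piggybank SUBSTRING implementation.
final class ClampedSubstring {
    private ClampedSubstring() {}

    /**
     * Returns str[beginIndex, endIndex), clamping endIndex to the string
     * length instead of letting String.substring throw.
     */
    static String substring(String str, int beginIndex, int endIndex) {
        if (str == null) {
            return null; // assumption: null in, null out
        }
        // endIndex is exclusive in Java, so str.length() is the largest valid value.
        int end = Math.min(str.length(), endIndex);
        if (beginIndex < 0 || beginIndex > end) {
            return null; // assumption: a bad start index still yields null
        }
        return str.substring(beginIndex, end);
    }
}
```

With this, substring("hello", 1, 100) gives "ello" instead of a dropped
record, while a start index past the end of the string still gives null.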

Thoughts?

-D