Hi Tanujit, Responses in-line:
On Sat, Aug 24, 2013 at 4:35 AM, Tanujit Ghosh <[email protected]>wrote: > Hi All, > > I put together a simple implementation of the upper() string function. > > Few questions > > 1. Is this the correct way, just wanted to understand to understand the > ByteBuf and whether we should be doing the setBytes with the string. > Another obvious optimization that can consider the byte as number and then > intrepret to ascii value and if it falls int he lower change change the > value to higher value. > Just wanted to know what is the way that would be better. > The supplied code appears correct. Note that the current default encoding for Drill is UTF-8, while the Java String is UTF-16. Ideally, the upper() function would operate directly on the UTF-8 encoded ByteBufs and avoid allocating a temporary string. But this seems just fine for now. > 2. It would be helpful to get an explainatin on the bytebuf and how its > wrapped around in the vector and holder objects and how this is supposed to > be shared. > A *ByteBuf* is a pooled, reference-counted resource (we actually use the * PooledByteBuf*). A *ValueVector* reference a ByteBuf, which stores a vector of values for a single field. A *VectorHolder* can be thought of as a reference to a single value in the ValueVector. When a ValueVector is 'done' with the ByteBuf, the buffer's reference count is decremented and can be reclaimed in the pool. See BaseDataValueVector.clear(). > > > @FunctionTemplate(name = "uppercase", scope = > FunctionTemplate.FunctionScope.SIMPLE, > nulls = FunctionTemplate.NullHandling.NULL_IF_NULL) > public static class UpperString implements DrillFunc { > > @Param VarCharHolder input; > @Output VarCharHolder out; > > public void setup(RecordBatch b){} > > public void eval(){ > out.buffer = input.buffer; > out.start = input.start; > out.end = input.end; > > String inputStr = input.toString().toUpperCase(); > out.buffer.setBytes(input.start, inputStr.getBytes()); > } > > public static class Provider implements CallProvider { > @Override > public FunctionDefinition[] getFunctionDefintions() { > return new FunctionDefinition[] { > FunctionDefinition.simple("uppercase", > new BasicArgumentValidator(new > Arg(Types.required(TypeProtos.MinorType.VARCHAR), > > Types.required(TypeProtos.MinorType.VAR16CHAR))), > I'm not sure you need to worry about VAR16CHAR for now, but you may want to allow Types.optional(VARCHAR). The NULL_IF_NULL declaration will handle null values automatically. > new OutputTypeDeterminer.SameAsFirstInput(), > "upper") > }; > } > } > } > > > Thanks and Regards, > Tanujit > Regards, -Ben
