I knocked up the following when we were experimenting with Hive. I've been meaning to tidy it up for a while, but using it with a separator of "" (empty string) should have the desired effect. (Obviously the UDF throws an exception if the array is empty; I've been meaning to fix that as well...)
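For anyone who wants to see the effect of the empty separator without a Hive install, here is a minimal plain-Java sketch of the joining step only. ImplodeDemo and its join method are hypothetical names used for illustration and are not part of Hive or of the UDF. The sketch writes the separator between elements rather than trimming a trailing one, and it assumes the array comes from splitting a YYYY-MM-DD date on "-".

```java
import java.util.Arrays;
import java.util.List;

public class ImplodeDemo {
    // Hypothetical stand-in for the UDF's join step; no Hive classes are involved.
    static String join(List<?> list, String separator) {
        if (list == null) {
            return null;
        }
        StringBuilder buf = new StringBuilder();
        boolean first = true;
        for (Object o : list) {
            // Write the separator before every element except the first.
            if (!first) {
                buf.append(separator);
            }
            buf.append(o);
            first = false;
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // Assumed input: a date string split into its year, month and day parts.
        List<String> parts = Arrays.asList("2012-05-14".split("-"));
        System.out.println(join(parts, ""));   // prints 20120514
        System.out.println(join(parts, "/"));  // prints 2012/05/14
    }
}
```

An empty list simply yields an empty string here, because the loop body never runs.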
Cheers,
Phil.

import java.util.List;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

@Description(name = "implode",
    value = "_FUNC_(list,separator) - joins the elements of list together, separated by the given string."
        + " Returns a string. [Add an example here]")
public class GenericUDFImplode extends GenericUDF {
  // Inspectors for the two arguments, set once in initialize().
  private ListObjectInspector listOI = null;
  private PrimitiveObjectInspector glueOI = null;

  @Override
  public Object evaluate(DeferredObject[] params) throws HiveException {
    Object glue = glueOI.getPrimitiveJavaObject(params[1].get());
    if (glue == null) {
      // A NULL separator yields NULL rather than a NullPointerException.
      return null;
    }
    return join(listOI.getList(params[0].get()), glue.toString());
  }

  // Joins the elements with the separator; a null list gives null, an empty list gives "".
  private String join(List<?> list, String separator) {
    if (list == null) {
      return null;
    }
    if (list.isEmpty()) {
      return "";
    }
    StringBuilder buf = new StringBuilder();
    for (Object o : list) {
      buf.append(o);
      buf.append(separator);
    }
    // Drop the trailing separator appended after the last element.
    return buf.substring(0, buf.length() - separator.length());
  }

  @Override
  public String getDisplayString(String[] args) {
    return "implode(" + args[0] + "," + args[1] + ")";
  }

  @Override
  public ObjectInspector initialize(ObjectInspector[] params) throws UDFArgumentException {
    if (params.length != 2) {
      throw new UDFArgumentException("Expected: exactly 2 arguments to implode()");
    }
    if (params[0].getCategory() != Category.LIST) {
      throw new UDFArgumentException("Expected: List as argument 1 to implode()");
    }
    if (params[1].getCategory() != Category.PRIMITIVE) {
      throw new UDFArgumentException("Expected: Primitive as argument 2 to implode()");
    }
    listOI = (ListObjectInspector) params[0];
    glueOI = (PrimitiveObjectInspector) params[1];
    return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
  }
}

On 15 May 2012 15:33, Nitin Pawar <nitinpawar...@gmail.com> wrote:

> I will write a UDF for array concatenation and upload on GIT if anyone
> does not have it already
>
> On Tue, May 15, 2012 at 7:24 PM, Zoltán Tóth-Czifra <
> zoltan.tothczi...@softonic.com> wrote:
>
>> Matt, thanks!
>>
>> Luckily the order of the parts of the date is correct (reordering them
>> would be the same craziness).
>>
>> Finally it is:
>>
>> regexp_replace(
>>   date_sub(
>>     to_date(
>>       from_unixtime(
>>         unix_timestamp()
>>       )
>>     ), 1
>>   ), "[-]", ""
>> )
>>
>> Nitin, concat apparently doesn't take arrays, and I did not find any
>> other way to join arrays in HQL. However, it would be very handy.
>>
>> Thanks guys!
>>
>> ------------------------------
>> *From:* Tucker, Matt [matt.tuc...@disney.com]
>> *Sent:* Tuesday, May 15, 2012 3:33 PM
>> *To:* user@hive.apache.org
>> *Subject:* RE: Date format - any easier way
>>
>> What about wrapping it in regexp_replace(…, "[-]", "") ? It may not
>> be the cleanest, but I'd recommend passing variables from the shell :)
>>
>> Matt Tucker
>>
>> *From:* Zoltán Tóth-Czifra [mailto:zoltan.tothczi...@softonic.com]
>> *Sent:* Tuesday, May 15, 2012 9:27 AM
>> *To:* user@hive.apache.org
>> *Subject:* RE: Date format - any easier way
>>
>> Nitin,
>>
>> Thank you. As you see below I know and use this function. My problem is
>> that it doesn't give YYYYMMDD format, but YYYY-MM-DD instead, and
>> formatting is not trivial as you can see it too.
>>
>> ------------------------------
>> *From:* Nitin Pawar [nitinpawar...@gmail.com]
>> *Sent:* Tuesday, May 15, 2012 3:24 PM
>> *To:* user@hive.apache.org
>> *Subject:* Re: Date format - any easier way
>>
>> you may want to have a look at this function
>>
>> date_sub(string startdate, int days)
>>
>> Subtract a number of days to startdate: date_sub('2008-12-31', 1) =
>> '2008-12-30'
>>
>> On Tue, May 15, 2012 at 6:41 PM, Zoltán Tóth-Czifra <
>> zoltan.tothczi...@softonic.com> wrote:
>>
>> Hi guys,
>>
>> Thank you very much in advance for your help.
>>
>> My problem in short is getting the date for yesterday in a YYYYMMDD
>> format. As I use this format for partitions, I need this format in quite
>> some queries.
>>
>> So far I have this:
>>
>> concat(
>>   year( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ),
>>   CASE
>>     WHEN month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10
>>     THEN concat( '0', month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) )
>>     ELSE trim( month( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) )
>>   END,
>>   CASE
>>     WHEN day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) < 10
>>     THEN concat( '0', day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) )
>>     ELSE trim( day( date_sub( to_date( from_unixtime( unix_timestamp() ) ), 1 ) ) )
>>   END
>> );
>>
>> ...but it seems to be a bit crazy, especially if you have to repeat it in
>> hundreds of queries. Is there any other (better) way to get this format
>> from yesterday? - there has to be. As I can't use local user variables nor
>> macros whatsoever, I need to repeat myself a lot here. If there is no other
>> way, probably I need to change my partitions.
>>
>> Any ideas are appreciated. Thank you!
>>
>> Zoltan
>>
>> --
>> Nitin Pawar
>
> --
> Nitin Pawar