Saurabh,

Yes, a SerDe is one way to deal with the input. You could also pre-parse the input, write a user-defined function, or use the streaming logic in MAP/REDUCE/TRANSFORM. "SerDe" is hyperlinked in the wiki because most wiki systems turn MashingWordsTogether into a link by default.
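
As a rough illustration of the pre-parse / streaming option, here is a minimal Python sketch that splits the access log format from your mail into tab-separated fields. The regular expression, the field names, and the function names are my own assumptions based on the sample line you posted, not anything shipped with Hive. It also assumes the quoted fields contain no embedded double quotes or tabs.

```python
import re
import sys

# Assumed layout, taken from the format quoted in this thread:
# ip "-" uid [timestamp] "request" status size "referrer" "user_agent" "cookies"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) '
    r'(?P<ident>\S+) '
    r'(?P<uid>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) '
    r'(?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<user_agent>[^"]*)" '
    r'"(?P<cookies>[^"]*)"'
)


def parse_line(line):
    """Return the ten fields of one log line as a list, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return list(match.groups())


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read raw log lines, write one tab-separated row per line that parses."""
    for line in stdin:
        fields = parse_line(line)
        if fields is not None:  # silently skip lines that do not fit the pattern
            stdout.write("\t".join(fields) + "\n")
```

To use this as a standalone filter, save it to a file (say, a hypothetical parse_log.py) and call main() at the bottom. You can then either run it over the log before LOAD DATA into a tab-delimited table, or load the raw lines into a single-string-column table and pass them through the script with TRANSFORM ... USING. Lines that do not match are dropped, so you may want to count or log them instead.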
Edward

On Mon, Jul 13, 2009 at 5:23 AM, Saurabh Nanda<[email protected]> wrote:
> Hi Zheng,
>
> Is my interpretation about SerDes correct -- "I'm assuming the SerDe stands
> for Serialization-Deserialization, and is used for importing input files
> which are not in an standard format."
>
> Do I need SerDes to import an access log in the following format:
>
> ip_address "-" apache_uid [dd/MMM/yyyy:HH:mm:ss +0530] "GET /location
> HTTP/1.1" response_code response_size "referrer" "user_agent_string"
> "cookies"
>
> If possible, please could you let me know the exact CREATE TABLE and LOAD
> DATA commands that I need to use to load this log file without using SerDes.
>
> Thanks,
> Saurabh.
>
> On Mon, Jul 13, 2009 at 2:35 PM, Zheng Shao <[email protected]> wrote:
>>
>> Hi Saurabh,
>>
>> In most cases, you won't need to know about SerDe.
>>
>> We are writing a how-to for adding new SerDes. Before that, you might
>> want to take a look at the code
>> serde/src/org/apache/hadoop/hive/serde2 if really interested.
>>
>>
>> Zheng
>>
>> On Mon, Jul 13, 2009 at 1:26 AM, Saurabh Nanda<[email protected]>
>> wrote:
>> > Hi,
>> >
>> > The DDL page in the Hive Language Manual
>> > (http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL) refers to SerDe
>> > (http://wiki.apache.org/hadoop/SerDe), but the page is non-existent. I'm
>> > assuming the SerDe stands for Serialization-Deserialization, and is used
>> > for
>> > importing input files which are not in an standard format.
>> >
>> > Where can I find more information on how to use SerDe?
>> >
>> > Saurabh.
>> > --
>> > http://nandz.blogspot.com
>> > http://foodieforlife.blogspot.com
>> >
>>
>>
>>
>> --
>> Yours,
>> Zheng
>
>
>
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>
