Can you run some other queries from the jobs1 table and see if any query returns data?
I am guessing your query "select jobs.values.position.title from jobs1;" may have an issue. Maybe it should be:

select jobs.values[0].position.title from jobs1;
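For what it's worth, here is a minimal sketch of the two usual ways to reach a field inside an array of structs (assuming the jobs1 schema quoted at the bottom of this thread; the aliases jtable and j are just illustrative, and I have not tried this against your data):

-- pick one element of the array by index
select jobs.values[0].position.title from jobs1;

-- or flatten the array to get one row per job
select j.position.title
from jobs1
lateral view explode(jobs.values) jtable as j;

Note that if the SerDe deserializes the whole row as null, both forms will still return null, so this only helps once rows are actually being produced.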
Regards,
Ramki.

On Wed, Jun 19, 2013 at 8:24 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:

> Thanks Stephen,
>
> That's just what I tried with the try_parsed table. It is exactly the same data, with less nesting in the structure and fewer entries.
> Do you mean to say that highly nested JSONs can lead to issues? What are typical solutions to such issues - write UDFs in Hive, or parse the JSON into a delimited file?
> I have heard of custom SerDes as well. Not sure if UDFs and custom SerDes are one and the same.
>
> regards
> Sunita
>
> On Wed, Jun 19, 2013 at 10:38 AM, Stephen Sprague <sprag...@gmail.com> wrote:
>
>> I think you might have to start small here instead of going for the home run on the first swing. When all else fails, start with a trivial JSON object and then build up from there, and see which additional step breaks it. That way, if even the trivial example fails, you know the problem is something fundamental and not the complexity of your JSON object throwing things off.
>>
>> On Wed, Jun 19, 2013 at 4:34 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>
>>> Thanks for sharing your experience, Richa.
>>> I do have timestamps, but in the format of year : INT, day : INT, month : INT.
>>> As per your suggestion, I changed them all to string, but I still get null as the output.
>>>
>>> regards
>>> Sunita
>>>
>>> On Wed, Jun 19, 2013 at 2:17 AM, Richa Sharma <mailtorichasha...@gmail.com> wrote:
>>>
>>>> Do you have any timestamp fields in the table that might contain null values?
>>>>
>>>> I faced a similar situation some time back - changing the data type to string made it work.
>>>>
>>>> But I was working on delimited text files.
>>>> Not sure if it applies to JSON .. but it's still worth a try!
>>>>
>>>> Richa
>>>>
>>>> On Wed, Jun 19, 2013 at 7:28 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>
>>>>> Having a column name the same as the table name is a problem, due to which I was not able to reference jobs.values.id from jobs. Changing the table name to jobs1 resolved the semantic error.
>>>>> However, the query still returns null:
>>>>>
>>>>> hive> select jobs.values.position.title from jobs1;
>>>>> Total MapReduce jobs = 1
>>>>> Launching Job 1 out of 1
>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>> Starting Job = job_201306080116_0036, Tracking URL = http://node01.expressanalytics.net:50030/jobdetails.jsp?jobid=job_201306080116_0036
>>>>> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306080116_0036
>>>>> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
>>>>> 2013-06-18 18:55:52,381 Stage-1 map = 0%, reduce = 0%
>>>>> 2013-06-18 18:55:56,394 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.88 sec
>>>>> 2013-06-18 18:55:57,400 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.88 sec
>>>>> 2013-06-18 18:55:58,407 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.88 sec
>>>>> MapReduce Total cumulative CPU time: 880 msec
>>>>> Ended Job = job_201306080116_0036
>>>>> MapReduce Jobs Launched:
>>>>> Job 0: Map: 1   Cumulative CPU: 0.88 sec   HDFS Read: 35374   HDFS Write: 3   SUCCESS
>>>>> Total MapReduce CPU Time Spent: 880 msec
>>>>> OK
>>>>> null
>>>>> Time taken: 9.591 seconds
>>>>>
>>>>> regards
>>>>> Sunita
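A hedged aside on narrowing this down: one way to tell whether the problem is in reading the file or in the nested column mapping is to mount the same directory as a single string column and pull fields out with Hive's built-in JSON UDFs. The table name jobs_raw below is made up, and I have not run this against your data:

create external table jobs_raw (json STRING)
LOCATION '/user/sunita/tables/jobs';

select get_json_object(json, '$.jobs._total') from jobs_raw;
select get_json_object(json, '$.jobs.values[0].position.title') from jobs_raw;

If these return values, the file is readable and the issue is in the nested schema or the SerDe; if they also come back null, the problem is upstream of the SerDe. This is also roughly the "UDFs instead of a custom SerDe" route asked about above: a SerDe maps JSON to columns when the table is read, while UDFs like get_json_object and json_tuple parse a plain string column at query time.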
>>>>>
>>>>> On Tue, Jun 18, 2013 at 9:35 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>>
>>>>>> Ok.
>>>>>> The data files are quite small - around 35 KB and 1 KB each.
>>>>>>
>>>>>> [sunita@node01 tables]$ hadoop fs -ls /user/sunita/tables/jobs
>>>>>> Found 1 items
>>>>>> -rw-r--r--   3 sunita hdfs   35172 2013-06-18 18:31 /user/sunita/tables/jobs/jobs_noSite_parsed.json
>>>>>>
>>>>>> [sunita@node01 tables]$ hadoop fs -text /user/sunita/tables/jobs/jobs_noSite_parsed.json | more
>>>>>> {"jobs": {"_total": 1812, "_count": 20, "_start": 0, "values": [{"company": {"id": 21836, "name": "CyberCoders"}, "postingDate": {"year": 2013, "day": 10, "month": 6}, "descriptionSnippet": "Software Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics Senior Hadoop Engineer - Skills Required - Hadoop, HDFS, HBase, Pig, SQL, Industrial Software Development, System Integration, Java, high performance, multi-threading, VerticaWe are a well known consumer product development company and we are looking to add a Hadoop Engineer to our Engineering team. You will be working with the latest ", "expirationDate": {"year": 2013, "day"
>>>>>>
>>>>>> It's a single line, so I used 'more' rather than 'head'. But effectively, the file exists and has the data.
>>>>>>
>>>>>> regards
>>>>>> Sunita
>>>>>>
>>>>>> On Tue, Jun 18, 2013 at 8:38 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>>>
>>>>>>> As Nitin alluded to, it's best to confirm the data is definitely in HDFS using HDFS semantics rather than Hive for the first step.
>>>>>>>
>>>>>>> 1. How big is it? hadoop fs -ls <your hdfs dir>
>>>>>>> 2. cat a bit of it and see if anything is there: hadoop fs -text <your hdfs dir>/<filename> | head -10
>>>>>>>
>>>>>>> Do you see any data from step #2?
>>>>>>>
>>>>>>> On Tue, Jun 18, 2013 at 3:58 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I ran some complex queries, something to the extent of
>>>>>>>>
>>>>>>>> select jobs from jobs;
>>>>>>>>
>>>>>>>> which triggers MapReduce jobs but does not show errors and produces the same output, "null". If I try referencing the struct elements, I get an error, which seems to be the root cause.
>>>>>>>>
>>>>>>>> Attached are the select statement outputs with the corresponding Hive logs.
>>>>>>>>
>>>>>>>> I have also attached my usage details of another table - try_parsed - which has a subset of the same data and seems to work fine. Also attached is the input file for this table - try_parsed.json.
>>>>>>>>
>>>>>>>> Thanks for your help
>>>>>>>> Sunita
>>>>>>>>
>>>>>>>> On Tue, Jun 18, 2013 at 4:35 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Can you run a little more complex query?
>>>>>>>>>
>>>>>>>>> Select unique values across columns or do some maths, so we know when it fires up a MapReduce job.
>>>>>>>>>
>>>>>>>>> On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for responding, Nitin. Yes, I am sure that the SerDe is working fine and the JSON file is being picked up, based on all the errors that showed up till this stage. What sort of error are you suspecting - the file not being present, or the SerDe not parsing it?
>>>>>>>>>>
>>>>>>>>>> On Tuesday, June 18, 2013, Nitin Pawar wrote:
>>>>>>>>>>
>>>>>>>>>>> select * from table is as good as hdfs -cat.
>>>>>>>>>>>
>>>>>>>>>>> Are you sure there is any data in the table?
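A small, hedged way to answer Nitin's question here: a count forces a full MapReduce pass and distinguishes "no rows at all" from "rows that deserialize to null" (not something that was run in this thread):

-- a row count > 0 alongside null column values points at
-- deserialization rather than an empty table
select count(*) from jobs1;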
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind <sunitarv...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am able to parse the input JSON file and load it into Hive. I do not see any errors with create table, so I am assuming that part worked. But when I try to read the data, I get null:
>>>>>>>>>>>>
>>>>>>>>>>>> hive> select * from jobs;
>>>>>>>>>>>> OK
>>>>>>>>>>>> null
>>>>>>>>>>>>
>>>>>>>>>>>> I have validated the JSON with JSONLint and the Notepad++ JSON plugin, and it is valid JSON. Here is my create table statement, and attached is the JSON input file.
>>>>>>>>>>>>
>>>>>>>>>>>> create external table jobs (
>>>>>>>>>>>>   jobs STRUCT<
>>>>>>>>>>>>     values : ARRAY<STRUCT<
>>>>>>>>>>>>       company : STRUCT<
>>>>>>>>>>>>         id : STRING,
>>>>>>>>>>>>         name : STRING>,
>>>>>>>>>>>>       postingDate : STRUCT<
>>>>>>>>>>>>         year : INT,
>>>>>>>>>>>>         day : INT,
>>>>>>>>>>>>         month : INT>,
>>>>>>>>>>>>       descriptionSnippet : STRING,
>>>>>>>>>>>>       expirationDate : STRUCT<
>>>>>>>>>>>>         year : INT,
>>>>>>>>>>>>         day : INT,
>>>>>>>>>>>>         month : INT>,
>>>>>>>>>>>>       position : STRUCT<
>>>>>>>>>>>>         title : STRING,
>>>>>>>>>>>>         jobFunctions : ARRAY<STRUCT<
>>>>>>>>>>>>           code : STRING,
>>>>>>>>>>>>           name : STRING>>,
>>>>>>>>>>>>         industries : ARRAY<STRUCT<
>>>>>>>>>>>>           code : STRING,
>>>>>>>>>>>>           id : STRING,
>>>>>>>>>>>>           name : STRING>>,
>>>>>>>>>>>>         jobType : STRUCT<
>>>>>>>>>>>>           code : STRING,
>>>>>>>>>>>>           name : STRING>,
>>>>>>>>>>>>         experienceLevel : STRUCT<
>>>>>>>>>>>>           code : STRING,
>>>>>>>>>>>>           name : STRING>>,
>>>>>>>>>>>>       id : STRING,
>>>>>>>>>>>>       customerJobCode : STRING,
>>>>>>>>>>>>       skillsAndExperience : STRING,
>>>>>>>>>>>>       salary : STRING,
>>>>>>>>>>>>       jobPoster : STRUCT<
>>>>>>>>>>>>         id : STRING,
>>>>>>>>>>>>         firstName : STRING,
>>>>>>>>>>>>         lastName : STRING,
>>>>>>>>>>>>         headline : STRING>,
>>>>>>>>>>>>       referralBonus : STRING,
>>>>>>>>>>>>       locationDescription : STRING>>>
>>>>>>>>>>>> )
>>>>>>>>>>>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>>>>>>>>>>>> LOCATION '/user/sunita/tables/jobs';
>>>>>>>>>>>>
>>>>>>>>>>>> The table creation works fine, but when I attempt to query, I get null as the result.
>>>>>>>>>>>> I tried adding input/output formats and SerDe properties; nothing seems to have an impact.
>>>>>>>>>>>>
>>>>>>>>>>>> I am of the opinion that the libraries cannot handle this level of nesting, and I probably will have to write a custom SerDe or a parser myself. Just wanted to seek guidance before I get into that. Appreciate your help and guidance.
>>>>>>>>>>>>
>>>>>>>>>>>> regards
>>>>>>>>>>>> Sunita
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nitin Pawar
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nitin Pawar
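A closing note on the "start small" suggestion above: most line-oriented JSON SerDes expect one complete JSON object per line, which the file here does satisfy, so a minimal version of the incremental test might look like the following. The file and table names are made up, this assumes the same SerDe jar is on the classpath, and it was not run in this thread:

echo '{"jobs": {"values": [{"id": "42"}]}}' > tiny.json
hadoop fs -mkdir /user/sunita/tables/tiny
hadoop fs -put tiny.json /user/sunita/tables/tiny/

create external table tiny (
  jobs STRUCT<values : ARRAY<STRUCT<id : STRING>>>
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/sunita/tables/tiny';

select jobs.values[0].id from tiny;
-- expected: 42; if even this returns null, the problem is fundamental
-- (SerDe or environment); if it works, add one level of nesting at a
-- time until it breaks.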