Hi All, I have an API response in JSON format and I am trying to load it into Hive tables. The CREATE TABLE statement runs without errors and returns to the prompt, but when I run "select xx from people", I just get "null". I need to know why this is happening. Hive does not show any errors, so there is nothing for me to debug. What I want is to monitor the execution of this query, for example to see what the current token is.
I'm trying to connect the Hive process to NetBeans. Both of the commands below seem to work (I get the hive prompt without errors), but I am not able to get the debugger connected.

Attempt 1:

    [sunita@node01 bin]$ ./hive -v -d -agentlib:jdwp=transport=dt_socket,address=x.x.x.x:8000

Here x.x.x.x is the IP address of my laptop. With this command, I started NetBeans with Debug -> Attach Debugger using socketListen, transport = dt_socket, localaddress = x.x.x.x and port = 8000.

Attempt 2:

    ./hive -v -d -agentlib:jdwp=transport=dt_socket,server=y,suspend=y

With this command, I started NetBeans using socketAttach, Host = h.h.h.h and port = 8000, where h.h.h.h is the IP address of the machine where I launch the hive CLI.

Attached is the table create statement with 2 input files. I am using a JSON serde, http://files.cloudera.com/samples/hive-serdes-1.0-SNAPSHOT.jar (mentioned on this page: https://github.com/cloudera/cdh-twitter-example).

I am also attaching 2 output samples. One of them (xxx20130729_0) has profileid, fetch_year, fetch_month, fetch_day and fetch_time added. This is the one I need to get working. For debugging purposes, I also tried without these fields (xx09July2013_20), loading just the response JSON without manipulations. This seems to load. I am doing something very similar for another table (adding columns), and that seems to work well.

I have no clue why this table shows null, and I don't see any information that would help me debug. Rather than trying permutations and combinations of the added columns and their data types, I want to know why this is failing. Requesting help from the community in identifying the root cause.
create external table linkedin_PeopleSearch (
  people STRUCT<
    values : ARRAY<STRUCT<
      firstName : STRING,
      headline : STRING,
      lastName : STRING,
      profileId : STRING,
      fetch_year : STRING,
      fetch_month : STRING,
      fetch_day : STRING,
      fetch_time : STRING,
      summary : STRING,
      siteStandardProfileRequest : STRUCT< url : STRING>,
      specialties : STRING,
      `location` : STRUCT< name : STRING>,
      positions : STRUCT<
        values : ARRAY<STRUCT<
          startDate : STRUCT< year : STRING, month : STRING>,
          title : STRING,
          company : STRUCT< industry : STRING, type : STRING, id : STRING, name : STRING>,
          summary : STRING,
          isCurrent : BOOLEAN,
          id : STRING>>>,
      publicProfileUrl : STRING>>>
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/sunita/Linkedin/PeopleSearch';
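
Since Hive reports nothing, one way to narrow this down outside Hive is to check the input files against the shape the table expects. The following is only a rough sketch and rests on assumptions that are not confirmed anywhere above: that the serde reads one JSON object per line, and that it looks up the column name (people) as a top-level key of that object. The function name check_record is made up for illustration.

```python
import json
import sys


def check_record(line, column="people"):
    """Return a short diagnosis for one line of an input file.

    Assumption (not verified against the serde): each line must be a
    complete JSON object whose top-level key matches the table column.
    """
    try:
        record = json.loads(line)
    except ValueError as exc:
        return "not valid JSON on one line: %s" % exc
    if not isinstance(record, dict):
        return "top level is not a JSON object"
    if column not in record:
        return "no top-level key %r; found %s" % (column, sorted(record))
    inner = record[column]
    values = inner.get("values") if isinstance(inner, dict) else None
    if not isinstance(values, list):
        return "%r has no 'values' array" % column
    return "ok: %d entries under %s.values" % (len(values), column)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_json.py <local copy of one input file>
    with open(sys.argv[1]) as handle:
        for number, text in enumerate(handle, start=1):
            print(number, check_record(text))
```

Running this over a local copy of each of the two sample files and comparing the output would show whether adding profileId and the fetch_* fields changed the top-level structure or split a record across lines. It does not say anything about how the serde itself behaves.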