Indeed it does

Josh

On Dec 2, 2008, at 8:07 PM, Zheng Shao wrote:

Does the following simpler query work?

SELECT activities.* FROM activities WHERE activities.dataset='poke' AND activities.properties['verb'] = 'Dance';



Zheng

On Tue, Dec 2, 2008 at 7:44 PM, Josh Ferguson <[EMAIL PROTECTED]> wrote:
These tables were already defined with those terminators:

CREATE TABLE activities
(actor_id STRING, actee_id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT)
CLUSTERED BY (actor_id, actee_id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;

Detailed Table Information:
Table(tableName:activities,dbName:default,owner:Josh,createTime:1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:actor_id,type:string,comment:null), FieldSchema(name:actee_id,type:string,comment:null), FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/activities,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[actor_id, actee_id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null), FieldSchema(name:application,type:string,comment:null), FieldSchema(name:dataset,type:string,comment:null), FieldSchema(name:hour,type:int,comment:null)],parameters:{})


CREATE TABLE users
(id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT)
CLUSTERED BY (id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;

Detailed Table Information:
Table(tableName:users,dbName:default,owner:Josh,createTime:1228208633,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:string,comment:null), FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/users,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null), FieldSchema(name:application,type:string,comment:null), FieldSchema(name:dataset,type:string,comment:null), FieldSchema(name:hour,type:int,comment:null)],parameters:{})

The Query Was:

SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users ON activities.actor_id = users.id WHERE activities.dataset='poke' AND activities.properties['verb'] = 'Dance';


Josh Ferguson

On Dec 2, 2008, at 1:44 AM, Zheng Shao wrote:

Hi Josh,

This is a known problem.

If any of the columns are map<xxx,xxx> or list<xxx>, you have to specify the ITEM terminator and KEY terminator (COLLECTION ITEMS TERMINATED BY and MAP KEYS TERMINATED BY). Please see the complete CREATE TABLE command syntax.
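
As a rough illustration of what those two terminators control, here is a small Python sketch of how one delimited text row with a map column could be split. It is not Hive's SerDe code. It assumes that the '44' and '58' values in the CREATE TABLE statements in this thread are read as ASCII codes (',' and ':') and that fields are separated by Hive's default Ctrl-A. The sample row and the helper names parse_map and parse_activity are made up for the example.

```python
# Illustrative only: mimics how a delimited text row with a map column
# could be split. This is not Hive's actual SerDe implementation.

FIELD_DELIM = chr(1)   # Hive's default field separator (Ctrl-A)
ITEM_DELIM = chr(44)   # ',' (assumed meaning of COLLECTION ITEMS TERMINATED BY '44')
KEY_DELIM = chr(58)    # ':' (assumed meaning of MAP KEYS TERMINATED BY '58')


def parse_map(raw):
    """Split a string such as 'k1:v1,k2:v2' into a dict."""
    result = {}
    for item in raw.split(ITEM_DELIM):
        if not item:
            continue  # skip empty items, e.g. from an empty map
        key, _, value = item.partition(KEY_DELIM)
        result[key] = value
    return result


def parse_activity(line):
    """Parse one hypothetical activities row: actor_id, actee_id, properties."""
    actor_id, actee_id, raw_props = line.split(FIELD_DELIM, 2)
    return {
        "actor_id": actor_id,
        "actee_id": actee_id,
        "properties": parse_map(raw_props),
    }


# Hypothetical sample row, invented for this sketch.
row = FIELD_DELIM.join(["u1", "u2", "verb:Dance,mood:happy"])
parsed = parse_activity(row)
print(parsed["properties"]["verb"])
```

If the terminators are missing, a reader like this has no way to break the properties field into keys and values, so the whole field stays a single string.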

Let me know if it works or not.

Zheng
From: Josh Ferguson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2008 1:36 AM
To: [email protected]
Subject: Index Expressions

I'm getting a bunch of errors during my reduce step while doing a join between two string fields for some reason.

java.lang.RuntimeException: Hive 2 Internal error: cannot evaluate index expression on string
          at org.apache.hadoop.hive.ql.exec.ExprNodeIndexEvaluator.evaluate(ExprNodeIndexEvaluator.java:64)
          at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
          at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
          at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:67)
          at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:262)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:257)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:477)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:507)
          at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:489)
          at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:140)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
          at org.apache.hadoop.mapred.Child.main(Child.java:155)

The query is:

SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users ON activities.actor_id = users.id WHERE activities.dataset='poke' AND activities.properties['verb'] = 'Dance';

hive> DESCRIBE activities;
OK
actor_id        string
actee_id        string
properties      map<string,string>
account         string
application     string
dataset         string
hour            int

hive> DESCRIBE users;
OK
id              string
properties      map<string,string>
account         string
application     string
dataset         string
hour            int

Thanks

Josh





--
Yours,
Zheng
