Indeed it does
Josh
On Dec 2, 2008, at 8:07 PM, Zheng Shao wrote:
Does the following simpler query work?
SELECT activities.* FROM activities WHERE activities.dataset='poke'
AND activities.properties['verb'] = 'Dance';
Zheng
On Tue, Dec 2, 2008 at 7:44 PM, Josh Ferguson <[EMAIL PROTECTED]>
wrote:
These tables were already defined with those
CREATE TABLE activities
(actor_id STRING, actee_id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING,
hour INT)
CLUSTERED BY (actor_id, actee_id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;
Detailed Table Information:
Table(tableName:activities,dbName:default,owner:Josh,createTime:
1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:
[FieldSchema(name:actor_id,type:string,comment:null),
FieldSchema(name:actee_id,type:string,comment:null),
FieldSchema
(name:properties,type:map<string,string>,comment:null)],location:/
user/hive/warehouse/
activities
,inputFormat:org
.apache
.hadoop
.mapred
.TextInputFormat
,outputFormat:org
.apache
.hadoop
.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:
32
,serdeInfo:SerDeInfo
(name:null
,serializationLib:org
.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:
{colelction
.delim
=
44
,mapkey
.delim
=
58
,serialization
.format
=
org
.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:
[actor_id, actee_id],sortCols:[],parameters:{}),partitionKeys:
[FieldSchema(name:account,type:string,comment:null),
FieldSchema(name:application,type:string,comment:null),
FieldSchema(name:dataset,type:string,comment:null),
FieldSchema(name:hour,type:int,comment:null)],parameters:{})
CREATE TABLE users
(id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING,
hour INT)
CLUSTERED BY (id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;
Detailed Table Information:
Table(tableName:users,dbName:default,owner:Josh,createTime:
1228208633,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:
[FieldSchema(name:id,type:string,comment:null),
FieldSchema
(name:properties,type:map<string,string>,comment:null)],location:/
user/hive/warehouse/
users
,inputFormat:org
.apache
.hadoop
.mapred
.TextInputFormat
,outputFormat:org
.apache
.hadoop
.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:
32
,serdeInfo:SerDeInfo
(name:null
,serializationLib:org
.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:
{colelction
.delim
=
44
,mapkey
.delim
=
58
,serialization
.format
=
org
.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:
[id],sortCols:[],parameters:{}),partitionKeys:
[FieldSchema(name:account,type:string,comment:null),
FieldSchema(name:application,type:string,comment:null),
FieldSchema(name:dataset,type:string,comment:null),
FieldSchema(name:hour,type:int,comment:null)],parameters:{})
The Query Was:
SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users
ON activities.actor_id = users.id WHERE activities.dataset='poke'
AND activities.properties['verb'] = 'Dance';
Josh Ferguson
On Dec 2, 2008, at 1:44 AM, Zheng Shao wrote:
Hi Josh,
This is a known problem.
If any of the columns are map<xxx,xxx> or list<xxx,xxx>, you have
to specify the ITEM terminator and KEY terminator. Please see the
complete CREATE table command syntax.
Let me know if it works or not.
Zheng
From: Josh Ferguson [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2008 1:36 AM
To: [email protected]
Subject: Index Expressions
I'm getting a bunch of errors during my reduce step while doing a
join between two string fields for some reason.
java.lang.RuntimeException: Hive 2 Internal error: cannot evaluate
index expression on string
at
org
.apache
.hadoop
.hive
.ql
.exec.ExprNodeIndexEvaluator.evaluate(ExprNodeIndexEvaluator.java:64)
at
org
.apache
.hadoop
.hive
.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:
72)
at
org
.apache
.hadoop
.hive
.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:
72)
at
org
.apache
.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:67)
at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:262)
at
org
.apache
.hadoop
.hive
.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:257)
at
org
.apache
.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:477)
at
org
.apache
.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
at
org
.apache
.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
at
org
.apache
.hadoop
.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:507)
at
org
.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:
489)
at
org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:
140)
at
org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
at org.apache.hadoop.mapred.Child.main(Child.java:155)
The Query is
SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users
ON activities.actor_id = users.id WHERE activities.dataset='poke'
AND activities.properties['verb'] = 'Dance';
hive> DESCRIBE activities;
OK
actor_id string
actee_id string
properties map<string,string>
account string
application string
dataset string
hour int
hive> DESCRIBE users;
OK
id string
properties map<string,string>
account string
application string
dataset string
hour int
Thanks
Josh
--
Yours,
Zheng