Hi all,
I am running the following simple script to read data from a Cassandra column
family and write it to a MySQL database on my local host.

CREATE EXTERNAL TABLE IF NOT EXISTS hourlyLog
  (id STRING, siteIp STRING, userIp STRING, size INT, timesta STRING)
  STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
  WITH SERDEPROPERTIES (
    "wso2.carbon.datasource.name" = "Wso2_log_analyzer",
    "cassandra.cf.name" = "web_log_entry",
    "cassandra.columns.mapping" = ":key, UserIp, SiteIp, dataAmount, timestamp" );
CREATE EXTERNAL TABLE IF NOT EXISTS hourlyusage (
  userIp STRING, siteIp STRING, amount INT, hour STRING)
  STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
  TBLPROPERTIES (
    'wso2.carbon.datasource.name' = 'Log_Analyzer',
    'hive.jdbc.update.on.duplicate' = 'true',
    'hive.jdbc.primary.key.fields' = 'userIp',
    'hive.jdbc.table.create.query' =
    'CREATE TABLE hourlyusage (userIp VARCHAR(15) NOT NULL,
     siteIp VARCHAR(15) NOT NULL, amount INT, hour VARCHAR(13))' );


INSERT OVERWRITE TABLE hourlyusage
  SELECT userIp, SiteIp, size, '1111111111' FROM hourlyLog;
SELECT userIp, siteIp, amount, hour FROM hourlyusage;

With 3000 entries, the script took about 5 minutes to run. Reading from
Cassandra alone takes less than 1 minute. Is this normal? If so, is there any
way I can speed it up?
Would using H2 instead of MySQL speed up the script?

Thank You.


-- 
*Chamila Wijayarathna*
Engineering Intern,
WSO2 Inc.
_______________________________________________
Dev mailing list
http://wso2.org/cgi-bin/mailman/listinfo/dev
