Hi all,

I am running the following simple script to read data from a Cassandra column family and write it to a MySQL database on my localhost.

    CREATE EXTERNAL TABLE IF NOT EXISTS hourlyLog
        (id STRING, siteIp STRING, userIp STRING, size INT, timesta STRING)
        STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
        WITH SERDEPROPERTIES (
            "wso2.carbon.datasource.name" = "Wso2_log_analyzer" ,
            "cassandra.cf.name" = "web_log_entry" ,
            "cassandra.columns.mapping" = ":key, UserIp, SiteIp, dataAmount, timestamp" );

    CREATE EXTERNAL TABLE IF NOT EXISTS hourlyusage(
        userIp STRING, siteIp STRING, amount INT, hour STRING)
        STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
        TBLPROPERTIES (
            'wso2.carbon.datasource.name'='Log_Analyzer',
            'hive.jdbc.update.on.duplicate' = 'true' ,
            'hive.jdbc.primary.key.fields' = 'userIp' ,
            'hive.jdbc.table.create.query' =
            'CREATE TABLE hourlyusage (userIp VARCHAR(15) NOT NULL ,
             siteIp VARCHAR(15) NOT NULL, amount INT, hour VARCHAR(13))' );

    insert overwrite table hourlyusage
        select userIp, SiteIp, size, '1111111111' from hourlyLog;

    SELECT userIp, siteIp, amount, hour FROM hourlyusage;

With 3000 entries, the script took about 5 minutes to run. Reading from Cassandra alone takes less than 1 minute. Is this normal? If so, is there any way I can speed it up? Would using H2 instead of MySQL speed up the script?

Thank you.

--
Chamila Wijayarathna
Engineering Intern,
WSO2 Inc.
