Thank you for replying,
In fact, i'm trying to run this script:
grunt> rows = LOAD 'cassandra://Keyspace1/Users' USING CassandraStorage()
AS (key, columns: bag {T: tuple(name, value)});
grunt> cols = FOREACH rows GENERATE flatten(columns);
grunt> colnames = FOREACH cols GENERATE $0;
grunt> namegroups = GROUP colnames BY (chararray) $0;
grunt> namecounts = FOREACH namegroups GENERATE COUNT($1), group;
grunt> orderednames = ORDER namecounts BY $0;
grunt> topnames = LIMIT orderednames 50;
grunt> dump topnames;

but i'm having this error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve
CassandraStorage using imports: [, org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]




2013/3/18 Dan DeCapria, CivicScience <[email protected]>

> Storing to Cassandra requires a key->column->value data structure from pig.
>  Here is one possible approach, requiring a udf to handle the pig
> formatting interchange to cassandra:
>
> -- sample pig script
> A = LOAD 'foo' USING PigStorage() AS (key:chararray, name:chararray,
> value:chararray);
> B = FOREACH A GENERATE
>
> FLATTEN(com.to.udf.ToCassandraUDF(TOTUPLE('PotentialKeyManipulation/',$0,'/toSomethingElse'),
> TOTUPLE($1), $2));
> STORE B INTO 'cassandra://cassandraNamespace/myColumnFamilyName' USING
> org.apache.cassandra.hadoop.pig.CassandraStorage();
>
> -- sample toCassandraUDF
> package com.to.udf.ToCassandraUDF;
>
> public class ToCassandraUDF extends EvalFunc<Tuple> {
>
> public Tuple exec(Tuple input) throws IOException {
> Tuple row = TupleFactory.getInstance().newTuple(2);
> StringBuffer buf = null;
>  Tuple keyParts = (Tuple) input.get(0);
> buf = new StringBuffer();
> for (Object o : keyParts.getAll()) {
> if (o != null) {
> buf.append(o.toString());
> } else {
> buf.append("null");
> }
> }
> String key = buf.toString();
>  Tuple columnParts = (Tuple) input.get(1);
> buf = new StringBuffer();
> for (Object o : columnParts.getAll()) {
> if (o != null) {
> buf.append(o.toString());
> } else {
> buf.append("null");
> }
> }
> String columnName = buf.toString();
>
> byte[] columnValueBytes = null;
> if (input.size() > 2) {
> Object value = input.get(2);
> columnValueBytes = value.toString().getBytes();
> } else {
> columnValueBytes = new byte[0];
> }
>
> Tuple column = TupleFactory.getInstance().newTuple(2);
> column.set(0, new DataByteArray(columnName.getBytes()));
> column.set(1, new DataByteArray(columnValueBytes));
>
> DataBag bagOfColumns = BagFactory.getInstance().newDefaultBag();
> bagOfColumns.add(column);
>
> row.set(0, key);
> row.set(1, bagOfColumns);
>
> return row;
> }
>
> public Schema outputSchema(Schema input) {
> try {
> Schema.FieldSchema keyField = new Schema.FieldSchema("key",
> DataType.CHARARRAY);
> Schema.FieldSchema nameField = new Schema.FieldSchema("name",
> DataType.CHARARRAY);
> Schema.FieldSchema valueField = new Schema.FieldSchema("value",
> DataType.BYTEARRAY);
>
> Schema bagSchema = new Schema();
> bagSchema.add(nameField);
> bagSchema.add(valueField);
>
> Schema.FieldSchema columnsField = new Schema.FieldSchema("columns",
> bagSchema, DataType.BAG);
>
> Schema innerSchema = new Schema();
> innerSchema.add(keyField);
> innerSchema.add(columnsField);
>
> Schema.FieldSchema cassandraTuple = new Schema.FieldSchema(
> "cassandra_tuple", innerSchema, DataType.TUPLE);
>
> Schema schema = new Schema(cassandraTuple);
> schema.setTwoLevelAccessRequired(true);
> return schema;
> } catch (Exception e) {
> return null;
> }
> }
> }
>
> Hope this helps.
>
> -Dan
>
> On Mon, Mar 18, 2013 at 11:15 AM, Mohammed Abdelkhalek <
> [email protected]> wrote:
>
> > Hi,
> > i'm using hadoop 1.0.4, cassandra 1.2.2 and pig 0.11.0.
> > Can any one help me with an example on how to use pig either for Storing
> to
> > cassandra from *pig* using Cassandrastorage, or Loading rows from
> cassandra
> > in order to use them with pig.
> > Thanks.
> >
> > --
> > Mohammed ABDELKHALEK
> > Ingénieur d'état de l’École Mohammadia d'Ingénieurs
> > Technologies et services de l'information
> > Téléphone: +212 6 45 64 65 68
> >
> > Préservons l'environnement. N'imprimez ce courriel que si nécessaire.
> > Please consider the environment before printing this e-mail.
> >
>



-- 
Mohammed ABDELKHALEK
Ingénieur d'état de l’École Mohammadia d'Ingénieurs
Technologies et services de l'information
Téléphone: +212 6 45 64 65 68

Préservons l'environnement. N'imprimez ce courriel que si nécessaire.
Please consider the environment before printing this e-mail.

Reply via email to