Thanks for the reply, Alex!
I'm trying to implement a couple of scenarios. The first is pretty much
what I explained in the post (i.e. adding a fixed prefix/suffix to every
field name in a pipe). The second is that I want to iterate through all
fields in a pipe and call a function on them based on their names. For
example, let's say I have a bunch of different fields in a pipe, and if a
field name contains the string "_list_" I want to convert its List[Any]
value to a sparse representation of the list as a String. I guess writing
a Cascading Function in Java and invoking the "each" method on my pipe
should do the trick, but I was wondering if there is a cleaner/easier way
of doing this in Scalding. This is the Function I have in mind:
import java.util.Iterator;
import java.util.List;
import cascading.operation.*;
import cascading.tuple.*;
import cascading.flow.*;

public class Sparser extends BaseOperation<Tuple> implements Function<Tuple>
{
    public Sparser()
    {
        // Fields.ARGS: declare the same field names that arrive as arguments,
        // since one output value is emitted per input field
        super( Fields.ARGS );
    }

    public Sparser( Fields fieldDeclaration )
    {
        super( fieldDeclaration );
    }

    public void operate( FlowProcess flowProcess, FunctionCall<Tuple> functionCall )
    {
        // get the argument field names and the argument TupleEntry
        Fields fieldNames = functionCall.getArgumentFields();
        TupleEntry arguments = functionCall.getArguments();

        // create a Tuple to hold our result values
        Tuple result = new Tuple();

        Iterator<?> iterator = arguments.getTuple().iterator();
        int i = 0;
        while( iterator.hasNext() )
        {
            Object obj = iterator.next();
            if( fieldNames.get( i ).toString().contains( "_list_" ) )
            {
                List<?> tmp = (List<?>) obj;
                String sparseRepresentation = tmp.toString(); // TO BE IMPLEMENTED
                result.add( sparseRepresentation );
            }
            else
            {
                // pass other values through unchanged; casting to String
                // would throw ClassCastException for non-String fields
                result.add( obj );
            }
            i++;
        }

        // return the result Tuple
        functionCall.getOutputCollector().add( result );
    }
}
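
For the part marked TO BE IMPLEMENTED, here is a rough sketch of what I mean by
a sparse String representation. The "index:value" format, the class name
SparseListFormat and the method name toSparseString are just placeholders I
made up for illustration, and I'm assuming the list holds Doubles where zeros
(and nulls) are the entries to skip:

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class SparseListFormat {
    // Hypothetical format: comma-separated "index:value" pairs,
    // keeping only the non-null, non-zero entries of the list.
    public static String toSparseString(List<Double> values) {
        StringJoiner joiner = new StringJoiner(",");
        for (int i = 0; i < values.size(); i++) {
            Double v = values.get(i);
            if (v != null && v != 0.0) {
                joiner.add(i + ":" + v);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // prints 1:1.5,4:2.0
        System.out.println(toSparseString(Arrays.asList(0.0, 1.5, 0.0, 0.0, 2.0)));
    }
}
```

Something like this could replace the tmp.toString() call inside operate, as
long as the field really contains numeric values.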
Btw, I'm not sure I understand what you mean by "an extractor method".
Could you please point me to an example?
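
For the first scenario, the renaming itself is easy once the field names are
known; the missing piece is still how to get those names out of the pipe. A
minimal sketch of the name-building step, where FieldPrefixer and withPrefix
are hypothetical names of my own and not part of Scalding or Cascading:

```java
import java.util.Arrays;

public class FieldPrefixer {
    // Returns a new array with the prefix prepended to every field name.
    // The input array is left unchanged.
    public static String[] withPrefix(String prefix, String... names) {
        String[] renamed = new String[names.length];
        for (int i = 0; i < names.length; i++) {
            renamed[i] = prefix + names[i];
        }
        return renamed;
    }

    public static void main(String[] args) {
        // prints [employee_name, employee_age]
        System.out.println(Arrays.toString(withPrefix("employee_", "name", "age")));
    }
}
```

My assumption is that the old and new name arrays could then be wrapped in
new Fields(...) and passed to rename, the same way as the hard-coded version
in my original post.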
Any input is greatly appreciated!
On Friday, June 30, 2017 at 3:51:09 PM UTC-7, Alex Levenson wrote:
>
> Probably not what you want to hear, but the scalding dev team is really
> only developing + supporting the Typed API at this point -- which would
> make something like this even more difficult.
> But the question I'd probably ask is what are you trying to do, and can
> you use strong types, the Typed API, and maybe an extractor method or
> similar instead?
>
> On Fri, Jun 30, 2017 at 2:13 PM, <[email protected]> wrote:
>
>> Here is the question:
>>
>> Assume I have a pipe and I want to rename all the fields in the pipe
>> programmatically, meaning that I don't want to hard code the field names in
>> my code. Any idea how I can do this?
>>
>> As a concrete example, assume I have a pipe with two fields: "name" and
>> "age" and I want to rename these fields to "employee_name" and
>> "employee_age". Obviously the natural solution is to write a piece of code
>> as below:
>>
>> pipe.rename(('name, 'age) -> ('employee_name, 'employee_age))
>>
>> or
>>
>> pipe.rename(new Fields("name", "age") -> new Fields("employee_name",
>> "employee_age"))
>>
>> However, what I need is to be able to iterate through all fields in the
>> pipe without knowing their names.
>>
>> There are a couple of methods (resolveIncomingOperationArgumentFields and
>> resolveIncomingOperationPassThroughFields) callable on a pipe which look
>> promising, but the issue is that they both take an input argument of type
>> cascading.flow.planner.Scope, and I don't know where to get one from in
>> a Scalding job.
>>
>> Another solution that comes to mind is using the "each" method on the
>> pipe: implementing a Cascading function and passing it to the each
>> statement. But I was not able to find any sample code for that either.
>>
>> Thanks!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Scalding Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Alex Levenson
> @THISWILLWORK
>