Arthur created SPARK-35193:
------------------------------

             Summary: Scala/Java compatibility issue Re: how to use externalResource in java transformer from Scala Transformer?
                 Key: SPARK-35193
                 URL: https://issues.apache.org/jira/browse/SPARK-35193
             Project: Spark
          Issue Type: Bug
          Components: Java API, ML
    Affects Versions: 3.1.1
            Reporter: Arthur


I am trying to make a custom transformer use an externalResource, because the 
transformation requires a large table. I am not very familiar with Scala 
syntax, but based on snippets found on the internet I believe I have written a 
correct Java implementation. However, I am running into the following error:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Param HardMatchDetector_d95b8f699114__externalResource does not belong to HardMatchDetector_d95b8f699114.
 at scala.Predef$.require(Predef.scala:281)
 at org.apache.spark.ml.param.Params.shouldOwn(params.scala:851)
 at org.apache.spark.ml.param.Params.set(params.scala:727)
 at org.apache.spark.ml.param.Params.set$(params.scala:726)
 at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
 at org.apache.spark.ml.param.Params.set(params.scala:713)
 at org.apache.spark.ml.param.Params.set$(params.scala:712)
 at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:41)
 at HardMatchDetector.setResource(HardMatchDetector.java:45)

 

Code as follows:
{code:java}
public class HardMatchDetector extends Transformer
        implements DefaultParamsWritable, DefaultParamsReadable, Serializable {

    public String inputColumn = "value";
    public String outputColumn = "hardMatches";

    // The param is held in a private field only.
    private ExternalResourceParam resourceParam = new ExternalResourceParam(
            this,
            "externalResource",
            "external resource, parquet file with 2 columns, one names and one wordcount");
    private String uid;

    public HardMatchDetector setResource(final ExternalResource value) {
        // HardMatchDetector.java:45 in the stack trace: this call throws.
        return (HardMatchDetector) this.set(this.resourceParam, value);
    }

    public HardMatchDetector setResource(final String path) {
        return this.setResource(new ExternalResource(path, ReadAs.TEXT(), new HashMap()));
    }

    @Override
    public String uid() {
        return getUid();
    }

    // Lazily generates the uid on first access.
    private String getUid() {
        if (uid == null) {
            uid = Identifiable$.MODULE$.randomUID("HardMatchDetector");
        }
        return uid;
    }

    @Override
    public Dataset<Row> transform(final Dataset<?> dataset) {
        return dataset;
    }

    @Override
    public StructType transformSchema(StructType schema) {
        return schema.add(
                DataTypes.createStructField(outputColumn, DataTypes.StringType, true));
    }

    @Override
    public Transformer copy(ParamMap extra) {
        return new HardMatchDetector();
    }
}

public class HardMatcherTest extends AbstractSparkTest {

    @Test
    public void test() {
        var hardMatcher = new HardMatchDetector().setResource(pathName);
    }
}
{code}
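
One possible explanation for the failure in Params.shouldOwn, offered as a sketch rather than a confirmed diagnosis: as far as I can tell, Spark ML builds a stage's list of params by reflecting over the public, zero-argument methods of the class whose return type is a Param, and shouldOwn requires the param's name to appear in that list. A param held only in a private Java field would then never be found, even though its parent uid matches. The self-contained example below mimics that lookup without any Spark dependency. The names ParamDiscoverySketch, Param, Stage, PrivateFieldOnly and WithPublicGetter are hypothetical stand-ins, not Spark classes.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ParamDiscoverySketch {

    // Hypothetical stand-in for a Spark ML Param: only the name matters here.
    public static class Param {
        final String name;

        Param(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for a pipeline stage. Assumption: params are
    // discovered through public zero-argument methods that return a Param.
    public abstract static class Stage {

        public List<String> discoveredParamNames() {
            List<String> names = new ArrayList<>();
            try {
                for (Method m : getClass().getMethods()) {
                    boolean returnsParam = Param.class.isAssignableFrom(m.getReturnType());
                    if (Modifier.isPublic(m.getModifiers())
                            && returnsParam
                            && m.getParameterCount() == 0) {
                        names.add(((Param) m.invoke(this)).name);
                    }
                }
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            return names;
        }

        public boolean hasParam(String name) {
            return discoveredParamNames().contains(name);
        }
    }

    // Mirrors the reported class: the param lives in a private field only.
    public static class PrivateFieldOnly extends Stage {
        private final Param resourceParam = new Param("externalResource");
    }

    // Same field, plus a public accessor that reflection can find.
    public static class WithPublicGetter extends Stage {
        private final Param resourceParam = new Param("externalResource");

        public Param externalResource() {
            return resourceParam;
        }
    }

    public static void main(String[] args) {
        // Private field only: the reflective lookup does not see the param.
        System.out.println(new PrivateFieldOnly().hasParam("externalResource"));
        // Public accessor present: the param is discovered.
        System.out.println(new WithPublicGetter().hasParam("externalResource"));
    }
}
```

If this is indeed the cause, adding a public no-argument method to HardMatchDetector that returns resourceParam may be enough for set(...) to accept it; I have not verified this against Spark 3.1.1.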
 

 


