[jira] [Comment Edited] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?

2017-10-04 Thread Akos Tomasits (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190904#comment-16190904
 ] 

Akos Tomasits edited comment on SPARK-12606 at 10/4/17 9:02 AM:


We have run into the same issue: we cannot create proper Java transformers 
derived from UnaryTransformer.

We would like to use these custom transformers through CrossValidator, which 
ultimately requires a constructor with a string (uid) parameter. I guess the 
custom transformer is supposed to set the provided uid in this constructor. 
However, the object's uid() method is called before the constructor finishes, 
which leads to the above-mentioned "null__inputCol" error.

I have created a new JIRA issue for this problem: SPARK-22198
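
The ordering problem described above can be reproduced without Spark. The following is a minimal sketch, not Spark code: `Param` and `BaseTransformer` are hypothetical stand-ins that only mimic the pattern (a parameter that captures its owner's uid while the superclass is still being constructed), and `UidInitOrderDemo` is a name chosen for illustration. It assumes that the shared inputCol param reads the parent's uid at creation time, which is consistent with the "null__inputCol" name in the error message.

```java
import java.util.UUID;

public class UidInitOrderDemo {

    // Hypothetical stand-in for a Spark Param: records the owner's uid when it is created.
    static final class Param {
        final String parentUid;
        final String name;

        Param(String parentUid, String name) {
            this.parentUid = parentUid;
            this.name = name;
        }

        @Override
        public String toString() {
            return parentUid + "__" + name;
        }
    }

    // Hypothetical stand-in for UnaryTransformer: the param is created during
    // superclass initialisation, which calls the overridden uid() method.
    static abstract class BaseTransformer {
        final Param inputCol = new Param(uid(), "inputCol");

        abstract String uid();
    }

    static final class Stemmer extends BaseTransformer {
        // This initialiser runs only after the BaseTransformer constructor has returned,
        // so uid() returns null while inputCol is being created.
        private final String uid = "Stemmer_" + UUID.randomUUID();

        @Override
        String uid() {
            return uid;
        }
    }

    public static void main(String[] args) {
        Stemmer stemmer = new Stemmer();
        System.out.println(stemmer.inputCol); // prints "null__inputCol"
        System.out.println(stemmer.uid());    // prints "Stemmer_" followed by a random UUID
    }
}
```

In Java 8 the superclass constructor always runs before any subclass field initialiser or constructor body, so assigning the uid in the subclass constructor comes too late for a param created this way.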


was (Author: akos.tomasits):
We have run into the same issue: we cannot create proper Java transformers 
derived from UnaryTransformer.

We would like to use these custom transformers through CrossValidator, which 
ultimately requires a constructor with a string (uid) parameter. I guess the 
custom transformer is supposed to set the provided uid in this constructor. 
However, the object's uid() method is called before the constructor finishes, 
which leads to the above-mentioned "null__inputCol" error.

> Scala/Java compatibility issue Re: how to extend java transformer from Scala 
> UnaryTransformer ?
> ---
>
> Key: SPARK-12606
> URL: https://issues.apache.org/jira/browse/SPARK-12606
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.2
> Environment: Java 8, Mac OS, Spark-1.5.2
> Reporter: Andrew Davidson
> Labels: transformers
>
> Hi Andy,
> I suspect that you have hit a Scala/Java compatibility issue. I can also 
> reproduce it, so could you file a JIRA to track it?
> Yanbo
> 2016-01-02 3:38 GMT+08:00 Andy Davidson :
> I am trying to write a trivial transformer I can use in my pipeline. I am 
> using Java and Spark 1.5.2. It was suggested that I use the Tokenizer.scala 
> class as an example. This should be very easy; however, I do not understand 
> Scala and I am having trouble debugging the following exception.
> Any help would be greatly appreciated.
> Happy New Year
> Andy
> java.lang.IllegalArgumentException: requirement failed: Param null__inputCol 
> does not belong to Stemmer_2f3aa96d-7919-4eaa-ad54-f7c620b92d1c.
>   at scala.Predef$.require(Predef.scala:233)
>   at org.apache.spark.ml.param.Params$class.shouldOwn(params.scala:557)
>   at org.apache.spark.ml.param.Params$class.set(params.scala:436)
>   at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
>   at org.apache.spark.ml.param.Params$class.set(params.scala:422)
>   at org.apache.spark.ml.PipelineStage.set(Pipeline.scala:37)
>   at org.apache.spark.ml.UnaryTransformer.setInputCol(Transformer.scala:83)
>   at com.pws.xxx.ml.StemmerTest.test(StemmerTest.java:30)
> public class StemmerTest extends AbstractSparkTest {
> @Test
> public void test() {
> Stemmer stemmer = new Stemmer()
> .setInputCol("raw") //line 30
> .setOutputCol("filtered");
> }
> }
> /**
>  * @see spark-1.5.1/mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
>  * @see https://chimpler.wordpress.com/2014/06/11/classifiying-documents-using-naive-bayes-on-apache-spark-mllib/
>  * @see http://www.tonytruong.net/movie-rating-prediction-with-apache-spark-and-hortonworks/
>  * 
>  * @author andrewdavidson
>  *
>  */
> public class Stemmer extends UnaryTransformer Stemmer> implements Serializable{
> static Logger logger = LoggerFactory.getLogger(Stemmer.class);
> private static final long serialVersionUID = 1L;
> private static final  ArrayType inputType = 
> DataTypes.createArrayType(DataTypes.StringType, true);
> private final String uid = Stemmer.class.getSimpleName() + "_" + 
> UUID.randomUUID().toString();
> @Override
> public String uid() {
> return uid;
> }
> /*
>   override protected def validateInputType(inputType: DataType): Unit = {
>     require(inputType == StringType, s"Input type must be string type but got $inputType.")
>   }
>  */
> @Override
> public void validateInputType(DataType inputTypeArg) {
> String msg = "inputType must be " + inputType.simpleString() + " but got " + inputTypeArg.simpleString();
> assert (inputType.equals(inputTypeArg)) : msg; 
> }
> 
> @Override
> public Function1 createTransformFunc() {
> // http://stackoverflow.com/questions/6545066/using-scala-from-java-passing-functions-as-parameters
>   

[jira] [Comment Edited] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?

2017-06-16 Thread Ilyes Hachani (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044508#comment-16044508
 ] 

Ilyes Hachani edited comment on SPARK-12606 at 6/16/17 9:52 AM:


Any update on this?

I am using Java and extending the class, but I get the following error when I 
call "setInputCol":

{code}
"java.lang.IllegalArgumentException: requirement failed: Param null__inputCol 
does not belong to TextCleaner_40f8c0be7bc7."
{code}

I tried setting the uid inside the constructor, but it made no difference.
{code:title=TextCleaner.java|borderStyle=solid}
private final String uid;

public TextCleaner() {
    uid = Identifiable$.MODULE$.randomUID("TextCleaner");
}

@Override
public String uid() {
    return uid;
}
{code}
Spark version 2.1.1
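
One possible workaround for this pattern is to generate the uid lazily inside uid() instead of in a field initialiser or constructor. The following is a sketch only, using hypothetical stand-in classes (`Param`, `BaseTransformer`, `LazyUidDemo`) rather than Spark's own. Whether it is sufficient for a real UnaryTransformer subclass, in particular one that must also accept a uid through its constructor, is not confirmed anywhere in this thread.

```java
import java.util.UUID;

public class LazyUidDemo {

    // Hypothetical stand-in for a Spark Param: remembers the uid of the object that created it.
    static final class Param {
        final String parentUid;
        final String name;

        Param(String parentUid, String name) {
            this.parentUid = parentUid;
            this.name = name;
        }

        @Override
        public String toString() {
            return parentUid + "__" + name;
        }
    }

    // Hypothetical stand-in for UnaryTransformer.
    static abstract class BaseTransformer {
        // Created during superclass initialisation, before any subclass constructor body runs.
        final Param inputCol = new Param(uid(), "inputCol");

        abstract String uid();

        // Loosely mimics the ownership check that fails in the reported exception.
        boolean owns(Param p) {
            return uid().equals(p.parentUid);
        }
    }

    static final class TextCleaner extends BaseTransformer {
        // Deliberately not final and without an initialiser: an initialiser would run
        // after the superclass constructor and overwrite the lazily generated value.
        private String uid;

        @Override
        String uid() {
            if (uid == null) {
                uid = "TextCleaner_" + UUID.randomUUID();
            }
            return uid;
        }
    }

    public static void main(String[] args) {
        TextCleaner cleaner = new TextCleaner();
        System.out.println(cleaner.owns(cleaner.inputCol)); // prints "true"
    }
}
```

The first call to uid(), made while the base class is still being initialised, fixes the value, so the param and the finished object agree on the same uid.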


was (Author: ihachani):
Any update on this?

I am using Java and extending the class, but I get the following error when I 
call "setInputCol":

{code}
"java.lang.IllegalArgumentException: requirement failed: Param null__inputCol 
does not belong to TextCleaner_40f8c0be7bc7."
{code}

I tried setting the uid inside the constructor, but it made no difference.
{code:title=TextCleaner.java|borderStyle=solid}
private final String uid;

public TextCleaner() {
    uid = Identifiable$.MODULE$.randomUID("TextCleaner");
}

@Override
public String uid() {
    return uid;
}
{code}


[jira] [Comment Edited] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?

2017-06-01 Thread Guillaume Dardelet (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955191#comment-15955191
 ] 

Guillaume Dardelet edited comment on SPARK-12606 at 6/1/17 9:55 AM:


I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}


was (Author: panoramix):
I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}


[jira] [Comment Edited] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?

2017-04-04 Thread Guillaume Dardelet (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955191#comment-15955191
 ] 

Guillaume Dardelet edited comment on SPARK-12606 at 4/4/17 2:27 PM:


I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code:scala}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}


was (Author: panoramix):
I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}


[jira] [Comment Edited] (SPARK-12606) Scala/Java compatibility issue Re: how to extend java transformer from Scala UnaryTransformer ?

2017-04-04 Thread Guillaume Dardelet (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955191#comment-15955191
 ] 

Guillaume Dardelet edited comment on SPARK-12606 at 4/4/17 2:27 PM:


I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}


was (Author: panoramix):
I had the same issue in Scala and I solved it by overloading the constructor so 
that it initialises the UID.

The error comes from the initialisation of the parameter "inputCol".
You get "null__inputCol" because when the parameter was initialised, your class 
didn't have a uid.

Therefore, instead of

{code:scala}
class Lemmatizer extends UnaryTransformer[String, String, Lemmatizer] {
  override val uid: String = Identifiable.randomUID("lemmatizer")
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}

Do this:

{code}
class Lemmatizer(override val uid: String) extends UnaryTransformer[String, 
String, Lemmatizer] {
  def this() = this( Identifiable.randomUID("lemmatizer") )
  protected def createTransformFunc: String => String = ???
  protected def outputDataType: DataType = StringType
}
{code}
