OK, for me, time is not a problem. I was just worried that there was no
movement on those issues. I think they are good contributions. For
example, I have found no complex discretization algorithm in MLlib,
which is strange. My algorithm, a Spark implementation of the well-known
discretizer developed by Fayyad and Irani, could be considered a good
starting point for the discretization part. Furthermore, it is also
supported by two scientific articles.
Anyway, I uploaded these two algorithms as two different packages to
spark-packages.org, but I would like to contribute directly to MLlib. I
understand you have a lot of requests, and it is not possible to include
all the contributions made by the Spark community.
I'll be patient and ready to collaborate.
Thanks again
On 03/11/15 16:30, Jerry Lam wrote:
Sergio, you are not alone for sure. Check the RowSimilarity
implementation [SPARK-4823]. It has been there for 6 months. It is
very likely that contributions which are not merged in the version of
Spark they were developed against will never be merged, because Spark
changes quite significantly from version to version, particularly if
the algorithm depends heavily on internal APIs.
On Tue, Nov 3, 2015 at 10:24 AM, Reynold Xin <r...@databricks.com> wrote:
Sergio,
Usually it takes a lot of effort to get something merged into
Spark itself, especially for relatively new algorithms that might
not have established themselves yet. I will leave it to the MLlib
maintainers to comment on the specifics of the individual
algorithms proposed here.
Just another general comment: we have been working on making
packages as easy to use as possible for Spark users. Right now,
including a package only requires passing a simple flag to the
spark-submit script.
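The flag referred to here is presumably the --packages option of spark-submit, which takes a comma-separated list of Maven coordinates. The following is only a minimal sketch: the helper name, the package coordinates, and the application file are hypothetical and do not come from this thread.

```python
import shlex


def spark_submit_command(app, packages):
    """Build a spark-submit argument list that pulls in external packages."""
    # --packages expects comma-separated Maven coordinates
    # in the form groupId:artifactId:version.
    return ["spark-submit", "--packages", ",".join(packages), app]


# Hypothetical coordinates and application file, for illustration only.
cmd = spark_submit_command("my_app.py", ["com.example:my-discretizer:0.1"])

# Print the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list returned by the helper can be handed to subprocess.run as-is, which avoids shell quoting problems when several packages are listed.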
On Tue, Nov 3, 2015 at 2:49 AM, Sergio Ramírez <sramire...@ugr.es> wrote:
Hello all:
I developed two packages for MLlib in March. They have also been
uploaded to the spark-packages repository. For these packages,
I created two JIRA issues and the corresponding pull requests,
which are listed below:
https://github.com/apache/spark/pull/5184
https://github.com/apache/spark/pull/5170
https://issues.apache.org/jira/browse/SPARK-6531
https://issues.apache.org/jira/browse/SPARK-6509
These remain unassigned in JIRA and unverified in GitHub.
Could anyone explain why they are still in this state? Is this normal?
Thanks!
Sergio R.
--
Sergio Ramírez Gallego
Research group on Soft Computing and Intelligent Information Systems,
Dept. Computer Science and Artificial Intelligence,
University of Granada, Granada, Spain.
Email: srami...@decsai.ugr.es
Research Group URL: http://sci2s.ugr.es/
-------------------------------------------------------------------------
This email and any file attached to it (when applicable) contain(s)
confidential information that is exclusively addressed to its
recipient(s). If you are not the indicated recipient, you are informed
that reading, using, disseminating and/or copying it without
authorisation is forbidden in accordance with the legislation in effect.
If you have received this email by mistake, please immediately notify
the sender of the situation by resending it to their email address.
Avoid printing this message if it is not absolutely necessary.