On Wed, Jan 24, 2018 at 11:06:01PM +0530, Rahul Barnwal wrote:
> Hi,
> I would like to contribute to the project "String Processing Utilities". I
> have already built the mlpack library and acquainted myself with it to some
> extent.
> 
> I am somewhat familiar with various string encoding methods in Python, and
> I am willing to learn things in the process. Could you tell me how to get
> started on this project and how to proceed?

Hi Rahul,

Thanks for getting in touch.  I think that the string processing
utilities project will be an interesting one.  I would say that the
right way to get started on this project is to understand the problem.

One way to do that (and there are many others) might be to find a data
science problem that involves strings: perhaps predicting the next
character in a sentence with an RNN, or performing TF-IDF (or similar)
on the Yelp reviews dataset and trying to predict the outcome with
logistic regression.  There are many tasks of this sort, but the key
would be to choose one, then use mlpack to solve it.
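
As a rough illustration of what the TF-IDF step involves, here is a
minimal sketch in plain Python (not mlpack code).  The function name
`tfidf`, the toy reviews, and the whitespace tokenizer are all
assumptions made for this example, and it uses the plain log(N / df)
form of IDF, which is only one of several common variants.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF weight of
    vocab[j] in docs[i].  Hypothetical helper, for illustration only."""
    # Naive tokenizer: lowercase and split on whitespace (no punctuation
    # handling, no stemming).
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter(tok for toks in tokenized for tok in set(toks))

    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        row = []
        for term in vocab:
            # Term frequency, normalized by document length.
            tf = counts[term] / len(toks) if toks else 0.0
            # Plain IDF; a term that appears in every document gets 0.
            idf = math.log(n_docs / df[term])
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix


# Toy stand-in for a reviews dataset (made up for this example).
reviews = ["great food great service", "terrible food", "great place"]
vocab, weights = tfidf(reviews)
```

The result is a purely numeric documents-by-terms matrix, which is the
kind of input a classifier such as logistic regression can then be
trained on.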

When you do that, you can then identify the difficulties inherent in
using string-based datasets with mlpack, and this can help inform your
proposal on how these types of issues could be mitigated or resolved.
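
To make that concrete, one difficulty of this kind is that a model
working on numeric vectors cannot consume raw characters directly; the
text has to be encoded first.  The sketch below, again plain Python
rather than mlpack code, shows one possible one-hot encoding for the
next-character task.  The function name `encode_next_char_pairs` and the
sample string are hypothetical.

```python
def encode_next_char_pairs(text):
    """Return (chars, inputs, targets) for next-character prediction.

    inputs[t] is the one-hot vector of text[t]; targets[t] is the index
    of text[t + 1] in chars.  Hypothetical helper, for illustration only.
    """
    # Build the character vocabulary and a character -> index lookup.
    chars = sorted(set(text))
    index = {c: i for i, c in enumerate(chars)}

    def one_hot(c):
        vec = [0.0] * len(chars)
        vec[index[c]] = 1.0
        return vec

    # Each character (except the last) is an input; its successor is the
    # label the model should predict.
    inputs = [one_hot(c) for c in text[:-1]]
    targets = [index[c] for c in text[1:]]
    return chars, inputs, targets


chars, inputs, targets = encode_next_char_pairs("hello")
# chars   -> ['e', 'h', 'l', 'o']
# targets -> [0, 2, 2, 3]  (the indices of 'e', 'l', 'l', 'o')
```

Writing this kind of glue by hand for every dataset is the sort of
friction worth noting down while working through a task.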

Of course, feel free to discuss ideas here on the mailing list.  Let me
know if I can clarify anything that I've written.

Thanks,

Ryan

-- 
Ryan Curtin    | "Avoid the planet Earth at all costs."
r...@ratml.org |   - The President
_______________________________________________
mlpack mailing list
mlpack@lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
