Hi Madhav,

Thank you for sharing. Yes, it may be possible.

Although there is overlap, the two approaches are a bit different.

Do you have some documentation on the performance of the linear regression
approach?

I'm not sure how well it would perform for gender (binary) and other
attributes.

Ideally, we would have a way to capture all traits with reasonable
performance.

Best,

Anthony


On Tue, Jun 14, 2016 at 8:46 AM, Madhav Sharan <[email protected]> wrote:

> Hi Anthony, the age prediction part of this enhancement looks very similar to
> https://issues.apache.org/jira/browse/TIKA-1988
>
> Do you see any way we can collaborate on this feature? I was thinking of
> building a TextFeatureParser that can parse multiple text-based features
> like age.
>
> In our project for age prediction we built a classifier using linear
> regression, which is available through a REST API (more details in [0]).
> We can configure multiple such REST APIs in Tika through a property file
> and then let the TextFeatureParser collate and present all the results.
>
> Let me know what you think about it. [1] has my code for TextFeatureParser;
> I will submit a PR soon.
>
> CCing Indhu for any questions regarding [0]
>
> [0] https://github.com/USCDataScience/Age-Predictor
> [1] https://github.com/smadha/tika/tree/TIKA-1988
>
>
> --
> Madhav Sharan
>
