Yes, something like lucene-classification [1].
However, there are several classifiers in this package.
Which one is best suited to my case? (Assume I collect more samples per class, about 30-40.)
Are there any good Java examples that use these classifiers?


Another question: what if I want the classification to work "semantically"?

For example:
For the class "crypto" I can have these samples:
- crypto
- bitcoin
- stablecoin
- blockchain

If the input text then contains "Ethereum", what happens? Will it match "crypto"?

I mean the models in lucene-classification: as far as I understand, they have no knowledge of semantic similarity between words, right?
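
To make the concern concrete, here is a minimal sketch in plain Java of what I assume a purely term-based matcher does with an unseen word such as "Ethereum". This is not Lucene code and makes no claim about how the lucene-classification classifiers are implemented; the class name `TermOverlapDemo`, its tokenizer, and the second class "weather" are all made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy term-overlap classifier; a hypothetical illustration, not part of Lucene. */
final class TermOverlapDemo {

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Counts the distinct input tokens that also occur in the samples of one class. */
    static int overlap(String input, List<String> samples) {
        Set<String> vocabulary = new HashSet<>();
        for (String sample : samples) {
            vocabulary.addAll(tokenize(sample));
        }
        Set<String> inputTokens = tokenize(input);
        inputTokens.retainAll(vocabulary);
        return inputTokens.size();
    }

    /** Returns the class with the highest overlap, or "unknown" if no term matches. */
    static String classify(String input, Map<String, List<String>> training) {
        String best = "unknown";
        int bestScore = 0;
        for (Map.Entry<String, List<String>> entry : training.entrySet()) {
            int score = overlap(input, entry.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Training data: the "crypto" samples from this mail plus an invented "weather" class. */
    static Map<String, List<String>> sampleTraining() {
        Map<String, List<String>> training = new LinkedHashMap<>();
        training.put("crypto", Arrays.asList("crypto", "bitcoin", "stablecoin", "blockchain"));
        training.put("weather", Arrays.asList("rain", "sunny", "forecast")); // hypothetical
        return training;
    }

    public static void main(String[] args) {
        Map<String, List<String>> training = sampleTraining();
        System.out.println(classify("Bitcoin price today", training));  // prints "crypto"
        System.out.println(classify("Ethereum price today", training)); // prints "unknown"
    }
}
```

In this toy version "Ethereum" shares no term with any sample, so it falls through to "unknown". My question is whether the Lucene classifiers are limited in the same way.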




On 2025/02/19 13:45:52 Tommaso Teofili wrote:
> Hi,
>
> if you have 30 classes with 10 samples per class, I'd say that's not an
> optimal distribution.
> Apart from that, you may use one of the text classifiers from
> lucene-classification [1], is anything like this what you had in mind?
> Alternatively you can also do things outside of Lucene and use Lucene only,
> for example, to store vectors and find nearest neighbors.
>
> Regards,
> Tommaso
>
> [1] :
> https://lucene.apache.org/core/10_1_0/classification/org/apache/lucene/classification/package-summary.html
>
> On Mon, 17 Feb 2025 at 16:15, Dmitri Geller <dm...@gmail.com> wrote:
>
> > Hi all, I would like to classify a sentence into one or two categories.
> > I see this classification roughly this way:
> >
> > ```
> > unknown:
> > example1
> > example2
> > ...
> > exampleN
> >
> > class1:
> > example1
> > example2
> > ...
> > exampleN
> >
> > class2:
> > example1
> > example2
> > ...
> > exampleN
> >
> > ...
> >
> > classN:
> > example1
> > example2
> > ...
> > exampleN
> >
> > ...
> > ```
> >
> > There are about 25-30 classes.
> > About 10-30 examples per class.
> > One sentence can get one or two classes assigned
> >
> > As far as I understand: this can be done with Lucene Core, should be
> > quite a standard functionality.
> > Can you point me to a Java example for this?
> >
> > Thanks in advance!
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >
>

