Hi,

If you have 30 classes with 10 samples per class, I'd say that's not an
optimal distribution (few examples for that many classes).
Apart from that, you can use one of the text classifiers from
lucene-classification [1]. Is something like this what you had in mind?
Alternatively, you can do the classification outside of Lucene and use Lucene
only to, for example, store vectors and find nearest neighbors.
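
To make the nearest-neighbor idea concrete, here is a rough plain-Java sketch
with no Lucene calls in it. Everything in it is an assumption for illustration
only: the class name KnnSentenceClassifier, the bag-of-words vectors, the
cosine similarity, and the k / maxLabels / minScore parameters. In a real
setup the in-memory scan would be replaced by a nearest-neighbor search over
vectors stored in a Lucene index.

```java
import java.util.*;
import java.util.stream.Collectors;

/** Hypothetical sketch: k-nearest-neighbor sentence classification, not a Lucene API. */
final class KnnSentenceClassifier {

    /** One labelled training sentence, kept as a term-frequency vector. */
    private static final class Example {
        final String label;
        final Map<String, Integer> vector;
        Example(String label, Map<String, Integer> vector) {
            this.label = label;
            this.vector = vector;
        }
    }

    private final List<Example> examples = new ArrayList<>();

    /** Lowercases the sentence, splits on non-alphanumerics, and counts terms. */
    static Map<String, Integer> vectorize(String sentence) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** Cosine similarity between two sparse term-frequency vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    void add(String label, String sentence) {
        examples.add(new Example(label, vectorize(sentence)));
    }

    /**
     * Sums the similarities of the k nearest examples per label and returns up to
     * maxLabels labels whose summed score reaches minScore, best first.
     * An empty result means nothing was similar enough.
     */
    List<String> classify(String sentence, int k, int maxLabels, double minScore) {
        Map<String, Integer> query = vectorize(sentence);
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Example ex : examples) {
            scored.add(new AbstractMap.SimpleEntry<String, Double>(ex.label, cosine(query, ex.vector)));
        }
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        Map<String, Double> votes = new HashMap<>();
        for (Map.Entry<String, Double> hit : scored.subList(0, Math.min(k, scored.size()))) {
            votes.merge(hit.getKey(), hit.getValue(), Double::sum);
        }
        return votes.entrySet().stream()
                .filter(e -> e.getValue() >= minScore)
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(maxLabels)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Calling classify(sentence, k, 2, minScore) would give the "one or two classes"
behavior from the question. With only 10-30 examples per class, k and minScore
would need tuning on held-out sentences, and term-frequency vectors are only a
stand-in for whatever embedding you end up using.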

Regards,
Tommaso

[1] https://lucene.apache.org/core/10_1_0/classification/org/apache/lucene/classification/package-summary.html

On Mon, 17 Feb 2025 at 16:15, Dmitri Geller <dmitri.gel...@gmail.com> wrote:

> Hi all, I would like to classify a sentence into one or two categories.
> I see this classification roughly this way:
>
> ```
> unknown:
>     example1
>     example2
>     ...
>     exampleN
>
> class1:
>     example1
>     example2
>     ...
>     exampleN
>
> class2:
>     example1
>     example2
>     ...
>     exampleN
>
> ...
>
> classN:
>     example1
>     example2
>     ...
>     exampleN
>
> ...
> ```
>
> There are about 25-30 classes.
> About 10-30 examples per class.
> One sentence can get one or two classes assigned.
>
> As far as I understand, this can be done with Lucene Core and should be
> quite standard functionality.
> Can you point me to a Java example for this?
>
> Thanks in advance!