Re: [go-nuts] Wordbreak and word extraction in Go?
Thanks, Marcel. I filed an issue here: https://github.com/golang/go/issues/17256

If you could outline a possible design there, that would be great.

On Thursday, September 22, 2016 at 5:51:48 PM UTC+2, Marcel van Lohuizen wrote:
> Hi Ingo,
>
> Thanks for your interest in x/text! Text segmentation is high on the priority list for x/text, but not yet implemented. Indeed, x/text/cases implements a (close) approximation of Annex #29 optimized for title casing, but it is not the full thing.
>
> For now, if your main interest is word segmentation, your best bet is to use github.com/blevesearch/segment. This is a decent implementation of Annex #29 for word breaking. I've been talking with Marty to see if this can be integrated with x/text even.
>
> But it would help to file an issue with exactly what you need.
>
> Please let me know if you have any other questions.
>
> Best regards,
>
> Marcel
>
> On Wed, Sep 21, 2016 at 5:41 AM, Nigel Tao wrote:
>> On Wed, Sep 21, 2016 at 7:34 AM, 'Ingo Oeser' via golang-nuts wrote:
>> > I am pretty sure I am overlooking something in the repository https://godoc.org/golang.org/x/text but I cannot find something to split text into words according to the next Unicode word splitting algorithm.
>> >
>> > Has anyone examples or can point me to the right direction? Can anyone confirm that this is missing? If missing, I would like to file an issue against the text repository for this.
>>
>> I'd ask mpvl (CC'ed).

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [go-nuts] Wordbreak and word extraction in Go?
Hi Ingo,

Thanks for your interest in x/text! Text segmentation is high on the priority list for x/text, but not yet implemented. Indeed, x/text/cases implements a (close) approximation of Annex #29 optimized for title casing, but it is not the full thing.

For now, if your main interest is word segmentation, your best bet is to use github.com/blevesearch/segment. This is a decent implementation of Annex #29 for word breaking. I've been talking with Marty to see if this can be integrated with x/text even.

But it would help to file an issue with exactly what you need.

Please let me know if you have any other questions.

Best regards,

Marcel

On Wed, Sep 21, 2016 at 5:41 AM, Nigel Tao wrote:
> On Wed, Sep 21, 2016 at 7:34 AM, 'Ingo Oeser' via golang-nuts wrote:
> > I am pretty sure I am overlooking something in the repository https://godoc.org/golang.org/x/text but I cannot find something to split text into words according to the next Unicode word splitting algorithm.
> >
> > Has anyone examples or can point me to the right direction? Can anyone confirm that this is missing? If missing, I would like to file an issue against the text repository for this.
>
> I'd ask mpvl (CC'ed).
Re: [go-nuts] Wordbreak and word extraction in Go?
Thanks for the suggestion, but I am looking for an implementation of http://unicode.org/reports/tr29/
Re: [go-nuts] Wordbreak and word extraction in Go?
How about strings.Fields? https://golang.org/pkg/strings/#Fields
[go-nuts] Wordbreak and word extraction in Go?
Hi all,

I am pretty sure I am overlooking something in the repository https://godoc.org/golang.org/x/text but I cannot find anything that splits text into words according to the next Unicode word splitting algorithm.

Does anyone have examples, or can anyone point me in the right direction? Can anyone confirm that this is missing? If it is missing, I would like to file an issue against the text repository for this.