The upload to contrib / experimental was rejected by the ftpmasters with
the following comment:

> can you please explain how I can recreate the files *.tiktoken?  There
> seem to be some sources missing ...

The two files in question are 50k lines of ASCII text that seem to be
some kind of index / vocabulary, and I have no idea how they were
created.  I suspect they might be an artifact of the model training, but
I do not know.  Does anyone have a clue to spare on how these were
created and how to rebuild them?  If we lack the source to rebuild them,
I currently believe the whisper package will have to go to non-free, not
contrib.  Any help in figuring this out would be most appreciated.
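
One way to test the "vocabulary" guess is to look at the contents
directly.  Below is a minimal sketch, assuming each non-empty line holds
a base64-encoded token followed by an integer rank.  That layout is an
assumption based on the file extension and has not been verified against
these two files; parse_tiktoken_lines is a hypothetical helper name, and
the sample data is made up.

```python
import base64


def parse_tiktoken_lines(lines):
    """Map decoded token bytes to their integer rank.

    Assumption: each non-empty line looks like '<base64 token> <rank>'.
    """
    ranks = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


# Made-up sample lines, NOT taken from the real *.tiktoken files.
sample = ["IQ== 0", "aGVsbG8= 1", ""]
vocab = parse_tiktoken_lines(sample)

# Print entries ordered by rank for a quick visual check.
for token, rank in sorted(vocab.items(), key=lambda kv: kv[1]):
    print(rank, token)
```

To run it on a real file, pass an open file object in place of the
sample list.  If the files parse cleanly this way, that would support
the vocabulary reading, but it would not by itself show how they were
generated.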

-- 
Happy hacking
Petter Reinholdtsen
