The only thing I want from the PPT files is the text; all graphical representations can be ignored. I need the text to perform various NLTK operations.
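
For illustration, here is a minimal sketch of that kind of text-only extraction using just the standard library. It rests on several assumptions: the presentation has been saved as .pptx (a zip archive of XML files) rather than the older binary .ppt, the slide text sits in DrawingML `<a:t>` elements under `ppt/slides/`, and the helper name `extract_pptx_text` is made up for this example. Text baked into images and speaker notes are not covered.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for paragraphs (<a:p>) and text runs (<a:t>)
# inside the slide XML of a .pptx file.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def extract_pptx_text(path_or_file):
    """Return a list with one string of text per slide.

    Hypothetical helper: slides are ordered by the number in their file
    name, which may differ from the order shown in the presentation.
    """
    slides = []
    with zipfile.ZipFile(path_or_file) as zf:
        names = [n for n in zf.namelist() if SLIDE_RE.match(n)]
        names.sort(key=lambda n: int(SLIDE_RE.match(n).group(1)))
        for name in names:
            root = ET.fromstring(zf.read(name))
            paragraphs = []
            for para in root.iter(A_NS + "p"):
                # Join the runs of one paragraph; graphics are simply skipped.
                text = "".join(t.text or "" for t in para.iter(A_NS + "t")).strip()
                if text:
                    paragraphs.append(text)
            slides.append("\n".join(paragraphs))
    return slides
```

The result could then be joined with `"\n".join(extract_pptx_text("talk.pptx"))` (the file name is only a placeholder) and passed to something like `nltk.word_tokenize`. A legacy .ppt file would first need converting to .pptx, for instance through Open/Libre Office as suggested in the quoted reply.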

On Fri, May 30, 2014 at 11:54 PM, Alan Gauld <alan.ga...@btinternet.com> wrote:

> On 30/05/14 10:41, Aaron Misquith wrote:
>
>> Like pypdf is used to convert pdf to text; is there any library that is
>> used in converting .ppt files to .txt? Even some sample programs will be
>> helpful.
>>
>
> Bearing in mind that Powerpoint is intended for graphical presentations
> the text elements are not necessarily going to be useful. Often Powerpoint
> text is actually part of a graphic anyway.
>
> If the Powerpoint is just a set of bullet points (shame on the presenter!)
> you probably don't want the text unless you can
> also get the notes. I don't know of any libraries that can do that.
>
> But one option is that Open/Libre office can import Powerpoint and
> apparently has a Python API which you could use to drive an export
> from there. Just a thought...
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.flickr.com/photos/alangauldphotos
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>