MimeTypes detector detects text/plain content type of a PPT file
----------------------------------------------------------------

                 Key: TIKA-689
                 URL: https://issues.apache.org/jira/browse/TIKA-689
             Project: Tika
          Issue Type: Bug
          Components: mime
    Affects Versions: 0.9, 1.0
            Reporter: Joseph Vychtrle


If I create a PPT file with POI like this:
{code}
import java.io.File;
import java.io.FileOutputStream;

import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextBox;
import org.apache.poi.hslf.usermodel.RichTextRun;
import org.apache.poi.hslf.usermodel.SlideShow;

    private void createPPTDocument(String from, File file) throws Exception {
        SlideShow ppt = new SlideShow();
        Slide slide = ppt.createSlide();

        // Build a text box holding the given text
        TextBox shape = new TextBox();
        shape.setText(from);
        // Position of the text box in the slide
        shape.setAnchor(new java.awt.Rectangle(50, 50, 500, 300));

        // Format the first (and only) rich text run
        RichTextRun rt = shape.getTextRun().getRichTextRuns()[0];
        rt.setFontSize(7);

        slide.addShape(shape);

        // Write the slide show to disk, closing the stream even on failure
        FileOutputStream out = new FileOutputStream(file);
        try {
            ppt.write(out);
        } finally {
            out.close();
        }
    }
{code}

and then detect its type:
{code}
MediaType mediaType = MediaType.parse(tika.detect(is));
{code}

The result is text/plain. The MimeTypes detector is used; the magic header 
simply doesn't match, so detection falls back to text/plain.
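
One way to narrow this down is to check the generated file directly. The sketch below is only a diagnostic aid, not part of Tika or POI: the class Ole2MagicCheck and its methods are hypothetical helpers. It assumes the file written by POI is an OLE2 compound document, which starts with the eight-byte signature D0 CF 11 E0 A1 B1 1A E1.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Tika or POI.
public class Ole2MagicCheck {

    // Signature at the start of an OLE2 compound document (legacy .ppt, .doc, .xls).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, (byte) 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, (byte) 0x1A, (byte) 0xE1
    };

    /** Returns true if the given bytes begin with the OLE2 signature. */
    static boolean startsWithOle2Magic(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (header[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads up to eight bytes from the stream; returns fewer if the stream ends early. */
    static byte[] readHeader(InputStream in) throws IOException {
        byte[] buf = new byte[OLE2_MAGIC.length];
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the file produced by createPPTDocument
        InputStream in = new FileInputStream(args[0]);
        try {
            boolean ole2 = startsWithOle2Magic(readHeader(in));
            System.out.println("OLE2 header present: " + ole2);
        } finally {
            in.close();
        }
    }
}
```

If the signature is present, the file on disk looks like a normal OLE2 container, and the stream passed to tika.detect(is) would be the next thing to inspect. If it is absent, the file was probably not written as expected.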

