[jira] [Commented] (PDFBOX-4062) Fetch Color of Text using PDFBox

2018-01-12 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323677#comment-16323677
 ] 

Tilman Hausherr commented on PDFBOX-4062:
-

See here:
https://pdfbox.apache.org/support.html

> Fetch Color of Text using PDFBox
> 
>
> Key: PDFBOX-4062
> URL: https://issues.apache.org/jira/browse/PDFBOX-4062
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vimal Kumar
>Priority: Blocker
> Attachments: b1.pdf
>
>
> I Need to Fetch the Color of Text in a PDF using pdfbox 2.0.0 , for the same 
> i have written java code as 
> {code}
> import java.io.ByteArrayOutputStream;
> import java.io.File;
> import java.io.IOException;
> import java.io.OutputStreamWriter;
> import java.io.Writer;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorN;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceRGBColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorN;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceRGBColor;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
> import org.apache.pdfbox.pdmodel.graphics.state.RenderingMode;
> import org.apache.pdfbox.text.PDFTextStripper;
> import org.apache.pdfbox.text.TextPosition;
> /**
>  * This is an example on how to get the colors of text. Note that this will 
> not tell the background,
>  * and will only work properly if the text is not overwritten later, and only 
> if the text rendering
>  * modes are 0, 1 or 2. In the PDF 32000 specification, please read 9.3.6 
> "Text Rendering Mode" to
>  * know more. Mode 0 (FILL) is the default. Mode 1 (STROKE) will make glyphs 
> look "hollow". Mode 2
>  * (FILL_STROKE) will make glyphs look "fat".
>  *
>  * @author Ben Litchfield
>  * @author Tilman Hausherr
>  */
> public class PDF_Box_1 extends PDFTextStripper
> {
> /**
>  * Instantiate a new PDFTextStripper object.
>  *
>  * @throws IOException If there is an error loading the properties.
>  */
> public PDF_Box_1() throws IOException
> {
> addOperator(new SetStrokingColorSpace());
> addOperator(new SetNonStrokingColorSpace());
> addOperator(new SetStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceRGBColor());
> addOperator(new SetStrokingDeviceRGBColor());
> addOperator(new SetNonStrokingDeviceGrayColor());
> addOperator(new SetStrokingDeviceGrayColor());
> addOperator(new SetStrokingColor());
> addOperator(new SetStrokingColorN());
> addOperator(new SetNonStrokingColor());
> addOperator(new SetNonStrokingColorN());
> }
> /**
>  * This will print the documents data.
>  *
>  * @param args The command line arguments.
>  *
>  * @throws IOException If there is an error parsing the document.
>  */
> public static void main(String[] args) throws IOException
> {
>
> try (PDDocument document = PDDocument.load(new 
> File("D://Vimal//New folder//ab.pdf")))
> {
> PDFTextStripper stripper = new PDF_Box_1();
> stripper.setSortByPosition(true);
> stripper.setStartPage(0);
> stripper.setEndPage(document.getNumberOfPages());
> stripper.getText(document);
> }
> 
> }
> @Override
> protected void processTextPosition(TextPosition text)
> {
> super.processTextPosition(text);
> PDColor strokingColor = getGraphicsState().getStrokingColor();
> PDColor nonStrokingColor = getGraphicsState().getNonStrokingColor();
> String unicode = text.getUnicode();
> RenderingMode renderingMode = 
> getGraphicsState().getTextState().getRenderingMode();
> System.out.println("Unicode:" + unicode);
> System.out.println("Rendering mode: " + renderingMode);
> System.out.println("Stroking color: " + strokingColor);
> 

[jira] [Commented] (PDFBOX-4062) Fetch Color of Text using PDFBox

2018-01-11 Thread Vimal Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323511#comment-16323511
 ] 

Vimal Kumar commented on PDFBOX-4062:
-

In Jira Type Section we dont have any option of "How" , can you provide a 
details on where should i file this then.

> Fetch Color of Text using PDFBox
> 
>
> Key: PDFBOX-4062
> URL: https://issues.apache.org/jira/browse/PDFBOX-4062
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vimal Kumar
>Priority: Blocker
> Attachments: b1.pdf
>
>
> I Need to Fetch the Color of Text in a PDF using pdfbox 2.0.0 , for the same 
> i have written java code as 
> {code}
> import java.io.ByteArrayOutputStream;
> import java.io.File;
> import java.io.IOException;
> import java.io.OutputStreamWriter;
> import java.io.Writer;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorN;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceRGBColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorN;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceRGBColor;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
> import org.apache.pdfbox.pdmodel.graphics.state.RenderingMode;
> import org.apache.pdfbox.text.PDFTextStripper;
> import org.apache.pdfbox.text.TextPosition;
> /**
>  * This is an example on how to get the colors of text. Note that this will 
> not tell the background,
>  * and will only work properly if the text is not overwritten later, and only 
> if the text rendering
>  * modes are 0, 1 or 2. In the PDF 32000 specification, please read 9.3.6 
> "Text Rendering Mode" to
>  * know more. Mode 0 (FILL) is the default. Mode 1 (STROKE) will make glyphs 
> look "hollow". Mode 2
>  * (FILL_STROKE) will make glyphs look "fat".
>  *
>  * @author Ben Litchfield
>  * @author Tilman Hausherr
>  */
> public class PDF_Box_1 extends PDFTextStripper
> {
> /**
>  * Instantiate a new PDFTextStripper object.
>  *
>  * @throws IOException If there is an error loading the properties.
>  */
> public PDF_Box_1() throws IOException
> {
> addOperator(new SetStrokingColorSpace());
> addOperator(new SetNonStrokingColorSpace());
> addOperator(new SetStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceRGBColor());
> addOperator(new SetStrokingDeviceRGBColor());
> addOperator(new SetNonStrokingDeviceGrayColor());
> addOperator(new SetStrokingDeviceGrayColor());
> addOperator(new SetStrokingColor());
> addOperator(new SetStrokingColorN());
> addOperator(new SetNonStrokingColor());
> addOperator(new SetNonStrokingColorN());
> }
> /**
>  * This will print the documents data.
>  *
>  * @param args The command line arguments.
>  *
>  * @throws IOException If there is an error parsing the document.
>  */
> public static void main(String[] args) throws IOException
> {
>
> try (PDDocument document = PDDocument.load(new 
> File("D://Vimal//New folder//ab.pdf")))
> {
> PDFTextStripper stripper = new PDF_Box_1();
> stripper.setSortByPosition(true);
> stripper.setStartPage(0);
> stripper.setEndPage(document.getNumberOfPages());
> stripper.getText(document);
> }
> 
> }
> @Override
> protected void processTextPosition(TextPosition text)
> {
> super.processTextPosition(text);
> PDColor strokingColor = getGraphicsState().getStrokingColor();
> PDColor nonStrokingColor = getGraphicsState().getNonStrokingColor();
> String unicode = text.getUnicode();
> RenderingMode renderingMode = 
> getGraphicsState().getTextState().getRenderingMode();
> System.out.println("Unicode:" + unicode);
> System.out.println("Rendering mode: " + renderingMode);
> 

[jira] [Commented] (PDFBOX-4062) Fetch Color of Text using PDFBox

2018-01-11 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322729#comment-16322729
 ] 

Tilman Hausherr commented on PDFBOX-4062:
-

I intend to close this because I think this is a "how to" question. You 
selected this as a "bug". How is this a bug?

> Fetch Color of Text using PDFBox
> 
>
> Key: PDFBOX-4062
> URL: https://issues.apache.org/jira/browse/PDFBOX-4062
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vimal Kumar
>Priority: Blocker
> Attachments: b1.pdf
>
>
> I Need to Fetch the Color of Text in a PDF using pdfbox 2.0.0 , for the same 
> i have written java code as 
> {code}
> import java.io.ByteArrayOutputStream;
> import java.io.File;
> import java.io.IOException;
> import java.io.OutputStreamWriter;
> import java.io.Writer;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorN;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetNonStrokingDeviceRGBColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColor;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorN;
> import org.apache.pdfbox.contentstream.operator.color.SetStrokingColorSpace;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceCMYKColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceGrayColor;
> import 
> org.apache.pdfbox.contentstream.operator.color.SetStrokingDeviceRGBColor;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.graphics.color.PDColor;
> import org.apache.pdfbox.pdmodel.graphics.state.RenderingMode;
> import org.apache.pdfbox.text.PDFTextStripper;
> import org.apache.pdfbox.text.TextPosition;
> /**
>  * This is an example on how to get the colors of text. Note that this will 
> not tell the background,
>  * and will only work properly if the text is not overwritten later, and only 
> if the text rendering
>  * modes are 0, 1 or 2. In the PDF 32000 specification, please read 9.3.6 
> "Text Rendering Mode" to
>  * know more. Mode 0 (FILL) is the default. Mode 1 (STROKE) will make glyphs 
> look "hollow". Mode 2
>  * (FILL_STROKE) will make glyphs look "fat".
>  *
>  * @author Ben Litchfield
>  * @author Tilman Hausherr
>  */
> public class PDF_Box_1 extends PDFTextStripper
> {
> /**
>  * Instantiate a new PDFTextStripper object.
>  *
>  * @throws IOException If there is an error loading the properties.
>  */
> public PDF_Box_1() throws IOException
> {
> addOperator(new SetStrokingColorSpace());
> addOperator(new SetNonStrokingColorSpace());
> addOperator(new SetStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceCMYKColor());
> addOperator(new SetNonStrokingDeviceRGBColor());
> addOperator(new SetStrokingDeviceRGBColor());
> addOperator(new SetNonStrokingDeviceGrayColor());
> addOperator(new SetStrokingDeviceGrayColor());
> addOperator(new SetStrokingColor());
> addOperator(new SetStrokingColorN());
> addOperator(new SetNonStrokingColor());
> addOperator(new SetNonStrokingColorN());
> }
> /**
>  * This will print the documents data.
>  *
>  * @param args The command line arguments.
>  *
>  * @throws IOException If there is an error parsing the document.
>  */
> public static void main(String[] args) throws IOException
> {
>
> try (PDDocument document = PDDocument.load(new 
> File("D://Vimal//New folder//ab.pdf")))
> {
> PDFTextStripper stripper = new PDF_Box_1();
> stripper.setSortByPosition(true);
> stripper.setStartPage(0);
> stripper.setEndPage(document.getNumberOfPages());
> stripper.getText(document);
> }
> 
> }
> @Override
> protected void processTextPosition(TextPosition text)
> {
> super.processTextPosition(text);
> PDColor strokingColor = getGraphicsState().getStrokingColor();
> PDColor nonStrokingColor = getGraphicsState().getNonStrokingColor();
> String unicode = text.getUnicode();
> RenderingMode renderingMode = 
> getGraphicsState().getTextState().getRenderingMode();
> System.out.println("Unicode:" + unicode);
> System.out.println("Rendering mode: " + renderingMode);
>