Hi, sorry about that.
What PDFBox version are you using? The current one is 2.0.7. The generic example is PrintTextLocations.java, and DrawPrintTextLocations.java does the same thing visually (see output: http://imgur.com/a/1awtu ).
For which characters were you not able to retrieve the location? Please describe where they are on the page, e.g. "top left", or explain what you were expecting and what was missing.
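
One possible cause worth ruling out, offered as an assumption rather than a diagnosis of this file: the matching loop in the posted code advances the position index only when an output character exactly equals the current list entry. A TextPosition can carry more than one character (for example a ligature such as "fi"), and in that case a single-character comparison never succeeds, so every later character is printed without a location. The sketch below reproduces only that loop logic with plain strings standing in for TextPosition objects. The class name LockstepMatchDemo, the method matchInLockstep, and the sample data are hypothetical and not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LockstepMatchDemo {

    /**
     * Mimics the matching loop from the posted code: walk the output text one
     * character at a time and advance the glyph index only on an exact match.
     * Returns, for each character, the index of the matched glyph, or -1 if
     * the character was left without a location.
     */
    public static List<Integer> matchInLockstep(String text, List<String> glyphs) {
        List<Integer> result = new ArrayList<>();
        int glyphIndex = 0;
        for (int i = 0; i < text.length(); i++) {
            String ch = String.valueOf(text.charAt(i));
            if (glyphIndex < glyphs.size() && ch.equals(glyphs.get(glyphIndex))) {
                result.add(glyphIndex);
                glyphIndex++;
            } else {
                // No match: the glyph index stays where it is.
                result.add(-1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a fig";

        // Hypothetical case 1: every glyph is a single character.
        // Only the space (which has no glyph) is left without a match.
        System.out.println(matchInLockstep(text, Arrays.asList("a", "f", "i", "g")));

        // Hypothetical case 2: one glyph entry covers two characters ("fi").
        // The index gets stuck on it and nothing after "a" is matched.
        System.out.println(matchInLockstep(text, Arrays.asList("a", "fi", "g")));
    }
}
```

If this is what happens with the real file, the characters that lose their location would all come after the first list entry that is not a single output character. Overriding writeString(String, List<TextPosition>), as PrintTextLocations.java does, avoids matching the text against the positions by hand.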
Tilman

On 22.08.2017 at 17:44, 二川村田 wrote:
Hello,

I tried to extract text from the PDF below:
http://jpdb.nihs.go.jp/jp17e/000217651.pdf

On the first page, there were some characters whose locations I could retrieve, but there were also characters whose locations I could not. What is the reason for this problem?

========================
my source to retrieve character locations
========================

=====================
// class extends PDFTextStripper
class PDFTextCordinateStripper extends PDFTextStripper {
    public List<TextPosition> list_text = new ArrayList<TextPosition>();

    public PDFTextCordinateStripper() throws IOException {
        super();
    }

    protected void processTextPosition(TextPosition text) {
        super.processTextPosition(text);
        list_text.add(text);
    }
}
=====================
// main (omitted)
PDFTextCordinateStripper stripper = new PDFTextCordinateStripper();
int len_page = doc.getNumberOfPages();
for (int ind = 1; ind <= len_page; ind++) {
    PDPage pg = doc.getPage(ind - 1);
    String str_page_num = "PageNum: " + ind;
    String str_page_size = "Width: " + pg_w + "\tHeight: " + pg_h;
    System.out.println(str_page_num + "\t" + str_page_size);

    stripper.list_text.clear();
    stripper.setStartPage(ind);
    stripper.setEndPage(ind);
    stripper.getText(doc);
    String p_text = stripper.getText(doc);

    Iterator<String> it_str = Arrays.asList(p_text.split("")).iterator();
    int ind_tp = 0;
    List<TextPosition> list_tp = stripper.list_text;
    int len_list_tp = list_tp.size();
    while (it_str.hasNext()) {
        String ch = it_str.next();
        String str_rec = "Text: " + ch;
        if (ind_tp < len_list_tp) {
            TextPosition tp = list_tp.get(ind_tp);
            if (ch.equals(tp.toString())) {
                str_rec += "\tx: " + tp.getX() + "\ty: " + tp.getY()
                        + "\tw: " + tp.getWidth() + "\th: " + tp.getHeight()
                        + "\tfont_size: " + tp.getFontSizeInPt();
                ind_tp++;
            }
        }
        System.out.println(str_rec);
    }

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

