Re: [Factor-talk] OCR via docsplit in Factor
If you get lost in path land you can always take a break and use the /full/path/to/docsplit.

On Feb 9, 2014, at 2:03 AM, CW Alston cwalsto...@gmail.com wrote:

> Ah! Thanks, Joe- Great tip; should clear up the issue with which. I am indeed starting Factor in the Finder. I'll try adjusting the plist. Maybe that even has something to do with my docsplit puzzle. Since I can address commands like couchdb via a process, I should be able to invoke docsplit that way as well, even though htop shows me that docsplit itself spawns sub-processes, like poppler and tesseract, to do its extraction work. Interesting.
>
> I'll go study the Mac dev doc you point to, see what I can glean from there.
>
> Back to the books,
> ~cw
>
> On Sat, Feb 8, 2014 at 10:27 PM, Joe Groff arc...@gmail.com wrote:
>
>> On Sat, Feb 8, 2014 at 7:30 PM, CW Alston cwalsto...@gmail.com wrote:
>>
>>> Hi -
>>>
>>> Ok, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg, still Version 0.97. Same issue with Factor's which:
>>>
>>>     IN: scratchpad USE: tools.which
>>>     IN: scratchpad "couchdb" which .
>>>     f
>>>     IN: scratchpad "python" which .
>>>     "/usr/bin/python"
>>>
>>> - The trouble appears to be with reporting my PATH properly, via getenv:
>>>
>>>     IN: scratchpad USE: environment
>>>     IN: scratchpad "PATH" os-env .
>>>     "/usr/bin:/bin:/usr/sbin:/sbin"
>>>     IN: scratchpad USE: unix.ffi
>>>     IN: scratchpad "PATH" getenv .
>>>     "/usr/bin:/bin:/usr/sbin:/sbin"
>>>     IN: scratchpad \ getenv see
>>>     USING: alien.c-types alien.syntax ;
>>>     IN: unix.ffi
>>>     LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ; inline
>>>
>>> - Here's my actual PATH, as seen in the terminal:
>>>
>>>     ➜ ~ git:(master) ✗ echo $PATH
>>>     /usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin
>>>
>>> - whereby which correctly finds couchdb:
>>>
>>>     ➜ ~ git:(master) ✗ which couchdb
>>>     /usr/local/bin/couchdb
>>>
>>> So, Factor's which (et al.) doesn't search beyond /usr/bin:/bin:/usr/sbin:/sbin. Reading through man getenv (GETENV(3), on OSX 10.6.8) doesn't give me a clue as to how to rectify this short-sightedness via the libc getenv. This is probably a side issue to my docsplit quandary (but maybe not).
>>>
>>> Anyone see a way to report my actual PATH to which in Factor? My PATH is augmented in my .zshrc. I don't understand why the libc function doesn't read it. Odd, indeed!
>>
>> If you're starting Factor from the Finder, you're not going to get a PATH set from your .profile or other shell dotfiles, since UI apps are launched under the loginwindow session and not under any shell. To set environment variables for UI apps, try setting them in ~/.MacOSX/environment.plist:
>>
>> https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPRuntimeConfig/Articles/EnvironmentVars.html
>>
>> -Joe
>
> --
> ~ Memento Amori

--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231iu=/4140/ostg.clktrk
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk
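As a stopgap alongside Joe's environment.plist suggestion, the PATH can probably also be extended from inside the running Factor image. The lines below are a minimal sketch, assuming `set-os-env` from the `environment` vocabulary and assuming that `which` reads PATH at call time; `/usr/local/bin` is simply the directory where couchdb was reported to live in this thread.

```factor
USING: environment io kernel prettyprint sequences tools.which ;

! Show what the running Factor process actually inherited.
"PATH" os-env print

! Prepend /usr/local/bin for this Factor session only
! (assumption: this is where the missing tools are installed).
"/usr/local/bin:" "PATH" os-env append "PATH" set-os-env

! If which consults PATH at call time, couchdb should now be found.
"couchdb" which .
```

The change only lasts for the life of the Factor process, so the plist (or launching Factor from a terminal) remains the durable fix.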
[Factor-talk] Best way to split a fixed locations
Hi,

I need to split a string at fixed locations (some of the locations may eventually be calculated, e.g. by looking up the '/', but as a first approximation fixed locations are OK). I came up with this example string and quotation:

    "NAXIS =3 / number of data axes"
    [ 0 swap 8 swap subseq ] [ 10 swap 30 swap subseq ] [ 33 swap 80 swap subseq ] tri
    [ [ 32 = ] trim ] tri@

This works, doing the split and the trim, but I am not too happy with the multiple swaps. I would like to keep stack manipulation to a minimum for clarity. Any recommendations?

hb9duj
Re: [Factor-talk] Best way to split a fixed locations
Maybe something like this:

    : subseqs ( indices seq -- subseqs )
        [ subseq [ CHAR: \s = ] trim ] curry { } assoc>map ;

You can see it works by returning an array of subseqs:

    IN: scratchpad "NAXIS =3 / number of data axes"
    { { 0 8 } { 10 30 } { 33 80 } } swap subseqs .
    { "NAXIS" "3" "number of data axes" }
Re: [Factor-talk] Best way to split a fixed locations
Or locals:

    :: foo ( seq -- a b c )
        0 8 seq subseq
        10 30 seq subseq
        33 80 seq subseq
        [ [ CHAR: \s = ] trim ] tri@ ;
Re: [Factor-talk] Best way to split a fixed locations
Hi,

you can get rid of most of the repetition by defining a word that operates on lists instead of using tri/tri@, for example:

    : subseqs ( seq indices -- subseqs )
        swap [ subseq ] curry { } assoc>map ;

    "NAXIS =3 / number of data axes" { { 0 8 } { 10 30 } { 33 80 } } subseqs

Also, [ 32 = ] trim is maybe more readable written as [ CHAR: space = ] trim, or, if you also want to trim tabs and \n\r, the library defines blank?, as in [ blank? ] trim.

So, for example, you can use:

    "NAXIS =3 / number of data axes" { { 0 8 } { 10 30 } { 33 80 } } subseqs
    [ [ blank? ] trim ] map first3

Note that the question you are asking is fairly low level, so you might find better methods to extract the info you want (for example, split-harvest from splitting.extras could be useful in your case).

Cheers,
Jon
Re: [Factor-talk] Best way to split a fixed locations
Some more variants for you to consider:

Using fry:

    : subseqs ( indices seq -- subseqs )
        '[ first2 _ subseq [ blank? ] trim ] map ;

Using with:

    : subseqs ( seq indices -- subseqs )
        [ first2 rot subseq [ blank? ] trim ] with map ;

Also, Jon's idea of finding higher-level splitting words to work with is really good. E.g.

    IN: scratchpad "NAXIS =3 / number of data axes"
    IN: scratchpad "/=" split [ [ blank? ] trim ] map .
    { "NAXIS" "3" "number of data axes" }

--
mvh/best regards Björn Lindqvist
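To try the fixed-column variants from this thread end to end, the input has to be a full 80-column record, since the last index pair reads up to column 80. Here is a minimal sketch, assuming the example is a FITS-style header card; the column layout rebuilt by the hypothetical word `sample-card` is an assumption, not the poster's exact string. The `subseqs` word is the curry/assoc>map version suggested earlier in the thread.

```factor
USING: assocs kernel prettyprint sequences ;

! Cut seq at each { from to } pair and trim surrounding spaces.
: subseqs ( indices seq -- subseqs )
    [ subseq [ CHAR: \s = ] trim ] curry { } assoc>map ;

! Hypothetical test record: keyword in columns 1-8, "= " in 9-10,
! value right-justified to column 30, comment after " / ",
! then padded with spaces to 80 columns.
: sample-card ( -- str )
    "NAXIS" 8 CHAR: \s pad-tail
    "= " append
    "3" 20 CHAR: \s pad-head append
    " / number of data axes" append
    80 CHAR: \s pad-tail ;

{ { 0 8 } { 10 30 } { 33 80 } } sample-card subseqs .
```

Keeping the index pairs in a literal array means a calculated split point (such as the position of the '/') can later replace a fixed pair without touching `subseqs` itself.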
Re: [Factor-talk] OCR via docsplit in Factor
Hi John-

Beg pardon, I should have mentioned earlier that since docsplit plants a .txt file in the target pdf's directory on its own, with no other output, I had gone the route you suggested, but to no avail, i.e.,

    "docsplit text --no-clean -l path" run-process drop

In the terminal,

    cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf

works fine. The surprise is that, in the listener, the phrase:

    "cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .

- returns with status 0, but leaves no file. Ditto using /full/path/to/docsplit in the command.

The docsplit bin alias (/usr/local/opt/ruby/bin/docsplit) resolves to /usr/local/Cellar/ruby/2.1.0/bin/docsplit (installed w/ homebrew). There I find this ruby script:

    require 'rubygems'

    version = ">= 0"

    if ARGV.first
      str = ARGV.first
      str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
      if str =~ /\A_(.*)_\z/
        version = $1
        ARGV.shift
      end
    end

    gem 'docsplit', version
    load Gem.bin_path('docsplit', 'docsplit', version)

If I manage to decipher this, I'll try to translate it into Factor, and invoke docsplit that way. That should keep me busy for a while. Worth a try, though I know zip about ruby.

Once past this boondoggle, I already have Factor code that walks the tree and collates the files.

Thanks!
~cw
Re: [Factor-talk] OCR via docsplit in Factor
It's probably easiest to specify the full path to the file, like I did in my previous message. Combined with the full path to the docsplit binary/link (for your particular problem), it should theoretically work fine:

    "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process
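The full-path invocation can also be written with the command as a sequence of strings rather than one string, which keeps each argument intact even if a path contains spaces. This is only a sketch: both paths are the placeholders used in this thread and need to be replaced with real locations.

```factor
USING: io.launcher ;

! Placeholders from the thread: substitute the real docsplit and PDF paths.
{
    "/full/path/to/docsplit" "text" "--no-clean" "-l" "chi_sim"
    "/path/to/1_long_gu/long_gu001.pdf"
} try-process
```

`try-process` throws an error when the exit status is non-zero, so a failed launch is harder to miss than with `run-process drop`.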
Re: [Factor-talk] OCR via docsplit in Factor
As a follow-up, from Factor you can use `with-directory-files` (http://docs.factorcode.org/content/word-with-directory-files,io.directories.html) and `absolute-path` (http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html) to get full paths to the files in some directory:

```
IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ] with-directory-files
"/home/alex/factor/core/generic"
"/home/alex/factor/core/parser"
"/home/alex/factor/core/sorting"
[etc]
```
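For the OCR case the same two words can be narrowed to just the PDFs in a directory. A small sketch, assuming the files end in a lowercase `.pdf` extension; the directory is the placeholder path used elsewhere in this thread.

```factor
USING: io.directories io.pathnames kernel prettyprint sequences ;

! "/path/to/1_long_gu" is a placeholder directory.
"/path/to/1_long_gu" [
    [ ".pdf" tail? ] filter      ! keep only the PDF file names
    [ absolute-path . ] each     ! print each one as a full path
] with-directory-files
```

The quotation runs with the directory as the current directory, which is why `absolute-path` can resolve the bare file names.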
Re: [Factor-talk] OCR via docsplit in Factor
Hi Alex-

Thanks, I did try

    "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

using both the symlink and the resolved executable:

    /usr/local/opt/ruby/bin/docsplit
    /usr/local/Cellar/ruby/2.1.0/bin/docsplit

but still no response, still status 0. A lightbulb went on, and I set a duplicate symlink in /usr/bin/docsplit (where Factor's which can find it) straight to /usr/local/Cellar/ruby/2.1.0/bin/docsplit:

    IN: scratchpad "docsplit" which .
    "/usr/bin/docsplit"

-ok, but still no success with anything in io.launcher. Oy!

I see on the web that this problem calling docsplit isn't confined to Factor. Help calls appear in Plone-Users (http://sourceforge.net/mailarchive/message.php?msg_id=29982797) and stackoverflow re python (http://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit).

This sticky wicket must have a workaround... I'll dig around some more.

~cw
Re: [Factor-talk] OCR via docsplit in Factor
Strange. Well, not actually strange, since many programs aren't great about return codes...but still!

I decided to re-enact the issue by removing /usr/local/bin (where my docsplit was installed) from my PATH, starting Factor, and trying it out. Looks like docsplit is dumping the txt file in the current working directory:

    IN: scratchpad "docsplit" which .
    f
    IN: scratchpad "docsplit text --no-clean -l eng /tmp/thesis.pdf" run-process status .
    255
    IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf" run-process status .
    0
    IN: scratchpad "/tmp/thesis.txt" exists? .
    f
    IN: scratchpad "thesis.txt" exists? .
    t

Seems as though you need to tell Factor to run in another working directory:

    IN: scratchpad "/tmp" [ "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf" run-process status . ] with-directory
    0
    IN: scratchpad "/tmp/thesis.txt" exists? .
    t

By the way, turns out you can set the `environment` slot of an io.launcher process, so I was thinking maybe that would help, but...

    IN: scratchpad <process> "docsplit text --no-clean -l eng /tmp/thesis.pdf" >>command "/tmp/stdout.txt" >>stdout +stdout+ >>stderr { { "PATH" "/usr/local/bin" } } >>environment run-process status .
    1
    IN: scratchpad "/tmp/stdout.txt" utf8 file-contents print
    sh: 1: pdftotext: not found

Damn. No dice. Looks like you'll have to fix the PATH issue on the system itself.

Anyway, hope that helps.

(P.S.: Charles, if you're getting this message again, it's because I think GMail might've screwed up the reply behavior and didn't send this to the list, so I'm re-sending it.)

On Sun, Feb 9, 2014 at 3:13 PM, CW Alston cwalsto...@gmail.com wrote:

Hi Alex-

Thanks, I did try

    "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

using both the symlink and the resolved executable:

    /usr/local/opt/ruby/bin/docsplit
    /usr/local/Cellar/ruby/2.1.0/bin/docsplit

but still no response, still status 0.

A lightbulb went on, and I set a duplicate symlink in /usr/bin/docsplit (where Factor's which can find it) straight to /usr/local/Cellar/ruby/2.1.0/bin/docsplit:

    IN: scratchpad "docsplit" which .
    /usr/bin/docsplit

- ok, but still no success with anything in io.launcher. Oy!

I see on the web that this problem calling docsplit isn't confined to Factor. Help calls appear in Plone-Users (http://sourceforge.net/mailarchive/message.php?msg_id=29982797) and on stackoverflow re python (http://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit).

Let me dig around some more; this sticky wicket must have a workaround...

~cw

On Sun, Feb 9, 2014 at 2:16 PM, Alex Vondrak ajvond...@gmail.com wrote:

As a follow-up, from Factor you can use `with-directory-files` (http://docs.factorcode.org/content/word-with-directory-files,io.directories.html) and `absolute-path` (http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html) to get full paths to the files in some directory:

```
IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ] with-directory-files
/home/alex/factor/core/generic
/home/alex/factor/core/parser
/home/alex/factor/core/sorting
[etc]
```

On Sun, Feb 9, 2014 at 1:53 PM, Alex Vondrak ajvond...@gmail.com wrote:

It's probably easiest to specify the full path to the file, like I did in my previous message. Combined with the full path to the docsplit binary/link (for your particular problem), it should theoretically work fine:

    "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

On Sun, Feb 9, 2014 at 1:00 PM, CW Alston cwalsto...@gmail.com wrote:

Hi John-

Beg pardon, I should have mentioned earlier that since docsplit plants a .txt file in the target pdf's directory on its own, with no other output, I had gone the route you suggested, but to no avail, i.e.,

    "docsplit text --no-clean -l path" run-process drop

In the terminal,

    cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf

works fine. The surprise is that, in the listener, the phrase:

    "cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .

- returns with status 0, but leaves no file. Ditto using /full/path/to/docsplit in the command.

The docsplit bin alias (/usr/local/opt/ruby/bin/docsplit) resolves to /usr/local/Cellar/ruby/2.1.0/bin/docsplit (installed w/ homebrew). There I find this ruby script:

    require 'rubygems'
    version = ">= 0"
    if ARGV.first
      str = ARGV.first
      str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
      if str =~ /\A_(.*)_\z/
        version = $1
        ARGV.shift
      end
    end
    gem 'docsplit', version
    load Gem.bin_path('docsplit', 'docsplit', version)

If I manage to decipher this, I'll try to translate it in Factor, and invoke docsplit that way. That should keep me busy for a while. Worth a
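A note on the `environment` experiment in Alex's message: the assoc he passed set PATH to /usr/local/bin alone, so one plausible reading of "pdftotext: not found" is that the helper programs live in a directory that was no longer on the child's PATH. The sketch below extends the inherited PATH instead of replacing it. The words `extended-path` and `run-with-path` are hypothetical names made up for this illustration, and it assumes the default environment mode of io.launcher lets entries in the `environment` slot override inherited variables, which is what the transcript suggests.

```factor
USING: accessors assocs environment io.launcher kernel
sequences ;

! Hypothetical helper: build an environment assoc whose PATH is
! the inherited PATH with `dir` placed in front of it.
: extended-path ( dir -- assoc )
    ":" append "PATH" os-env append "PATH" associate ;

! Hypothetical helper: run `command` with `dir` added to the
! child's PATH, so programs the command spawns can be found too.
: run-with-path ( dir command -- process )
    <process> swap >>command
    swap extended-path >>environment
    run-process ;
```

In the listener this could be tried as `"/usr/local/bin" "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf -o /tmp" run-with-path status .`, using the paths from Alex's transcript. Whether it helps depends on where pdftotext and tesseract are actually installed.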
Re: [Factor-talk] OCR via docsplit in Factor
Yeah, Alex-

I would have thought the cd in my compound command string would take care of the current directory issue. There's another thread about this problem (http://www.programmingrelief.com/3213645/Docsplit-Works-Fine-In-Command-Line-But-Ignores-Code-In-Ruby-Script%3F) that finds docsplit returning files in the root directory; on my system no files are winding up there.

Let me see what I can do w/ your path/environment suggestions. Gonna be another long night...

Thanks much,
~cw
Re: [Factor-talk] OCR via docsplit in Factor
Thing is, `cd` isn't a binary that Factor can execute in a process. It's just a shell command implemented by bash or zsh or whatever you use. Same with the semicolon syntax, for that matter. You might try to finagle something like

    IN: scratchpad { "sh" "-c" "cd /tmp ; pwd" } utf8 [ contents . ] with-process-reader
    "/tmp\n"

Not sure how the PATH stuff will work out with that, though.

You could also try just using the `-o` flag to docsplit. Again, deliberately messing up my PATH so Factor can't run docsplit directly:

    IN: scratchpad "docsplit" which .
    f
    IN: scratchpad "/tmp/thesis.pdf" exists? .
    t
    IN: scratchpad "/tmp/thesis.txt" exists? .
    f
    IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf -o /tmp" try-process
    IN: scratchpad "/tmp/thesis.txt" exists? .
    t
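To make the point about `cd` concrete: a command given to io.launcher as a string is split into arguments and executed directly, with no shell in between, so `cd` and `;` never get their shell meaning. On systems that ship a standalone `cd` utility, that could also explain an exit status of 0 with no output file; that is an inference, not something the thread confirms. The sketch below is a shell-free alternative built from `with-directory` and `try-process`, both used earlier in the thread. `run-in` is a hypothetical name for this illustration.

```factor
USING: io.directories io.launcher kernel ;

! Hypothetical helper: run `command` with `dir` as the child's
! working directory. No shell is involved, so `command` must be
! a single program invocation (no `cd`, no `;`).
: run-in ( dir command -- )
    swap [ try-process ] with-directory ;
```

With the paths from the transcript, `"/tmp" "/usr/local/bin/docsplit text --no-clean -l eng thesis.pdf" run-in` should behave like the terminal's `cd /tmp ; docsplit ...`, provided docsplit can find its helper programs on the PATH that Factor inherited.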
Re: [Factor-talk] installation packages for CI?
2014-02-08 11:56 GMT+01:00 Gabriel Kerneis gabr...@kerneis.info:

On Fri, Feb 07, 2014 at 06:44:32PM -0500, Andrew Pennebaker wrote:

If we met users half way, presenting .deb's, .rpm's, maybe a ppa repo, that would be a great start.

As a first step, I recommend using https://build.opensuse.org/ It is slightly openSUSE centered, but makes it easy to check your basic rpm and deb builds for Ubuntu, Debian, Fedora and openSUSE.

A while ago I created Ubuntu packages for Factor and put them in my PPA here: https://launchpad.net/~bjourne/+archive/factor Someone has also packaged Factor for Arch: https://aur.archlinux.org/packages/factor/

A big problem is that Factor's build doesn't make it easy to install system-wide in a typical Linux setup. So you have to add lots of hacks to the build to add support for prefixed installation, with binary and support files split in different directories. It's a lot of work and hard to keep in sync with Factor's github repository.

A smaller problem is that some Factor words want to overwrite the image file and write in directories relative to the executable file, which obviously is problematic on Linux where writes outside of $HOME are forbidden. But I think most Linux users can live with that limitation.

So to address the problematic build I've created an alternate build process which you are welcome to check out here: https://github.com/slavapestov/factor/pull/934 It's written using waf, which I think is great for complicated build processes like Factor's. With the branch, the build command becomes

    python waf.py configure --prefix=/opt/factor2 build && sudo python waf.py install

Using that as their base, I believe someone knowledgeable about their distro's build system could very easily package Factor.

Then to actually get distros to put Factor in their repos would entail reopening tickets like this: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471925 Plus, Debian has some bureaucratic rules on packages they ship. Like requiring a man page.

--
mvh/best regards Björn Lindqvist

___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk
Re: [Factor-talk] installation packages for CI?
I can add that ppa to my source list, but when I `apt-get install factor`, I get a program for factoring compound numbers, not the Factor programming language. Could we rename the ppa to work around the name collision?

--
Cheers,
Andrew Pennebaker
www.yellosoft.us
Re: [Factor-talk] OCR via docsplit in Factor
Lord love a duck, Alex -

I didn't realize that builtins like `cd` are 'existentially' different from utilities like `cat` (I only speak pidgin unix; bites me often). Thanks for the heads-up.

Okay... I'll try moving or copying my target directory into my home folder, to obviate the need for any cd'ing (I hope), and pass docsplit an array of pdfs and flags; or maybe have docsplit iterate over a tmp file containing lines like:

    chi_sim long_gu001.pdf
    eng long_gu002.pdf
    eng long_gu003.pdf
    ...

Probably have to do this in a script. Never a dull moment.

~cw
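The list-file idea can be driven from Factor itself rather than from a separate script. The following is a minimal sketch, assuming each non-empty line has the form "language file.pdf" with a single space between the two, and that /full/path/to/docsplit stands in for the real location of the executable, as it does in the thread. The words `docsplit-command`, `ocr-line` and `ocr-list` are hypothetical names for this illustration; the vocabulary and word names follow the Factor 0.97 used in the thread and may differ in later releases.

```factor
USING: arrays io.directories io.encodings.utf8 io.files
io.launcher kernel sequences splitting ;

! Build the argument array for one file. Passing an array means
! each element reaches docsplit as exactly one argument.
: docsplit-command ( lang pdf -- command )
    [ { "/full/path/to/docsplit" "text" "--no-clean" "-l" } ] 2dip
    2array append ;

! Run docsplit for one "lang file.pdf" line.
: ocr-line ( line -- )
    " " split1 docsplit-command try-process ;

! Read the list file, skip empty lines, and process every entry
! with `dir` as the working directory, so relative pdf names
! resolve there and the .txt output lands next to them.
: ocr-list ( dir list-file -- )
    utf8 file-lines harvest swap
    [ [ ocr-line ] each ] with-directory ;
```

A call might look like `"/path/to/1_long_gu" "/tmp/ocr-list.txt" ocr-list`, where the list-file name is made up for the example. Since docsplit was seen returning status 0 without producing a file, it is worth checking for the .txt output afterwards instead of relying on `try-process` alone.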