Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread John Benediktsson
If you get lost in path land you can always take a break and use the 
/full/path/to/docsplit.

 On Feb 9, 2014, at 2:03 AM, CW Alston cwalsto...@gmail.com wrote:
 
 Ah! Thanks, Joe- 
 Great tip; should clear up the issue with which. I am indeed starting 
 Factor in the Finder. I'll try adjusting the plist.
 Maybe that even has something to do with my docsplit puzzle. Since I can 
 address commands like couchdb 
 via a process, I should be able to invoke docsplit that way as well, even 
 though htop shows me that docsplit
 itself spawns sub-processes, like poppler & tesseract, to do its extraction 
 work. Interesting.
 
 I'll go study the Mac dev doc you point to, & see what I can glean from there.
 
 Back to the books,
 ~cw
 
 
 
 
 
 On Sat, Feb 8, 2014 at 10:27 PM, Joe Groff arc...@gmail.com wrote:
 On Sat, Feb 8, 2014 at 7:30 PM, CW Alston cwalsto...@gmail.com wrote:
 Hi -
 Ok, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg,
 still Version 0.97. Same issue with Factor's which:
 
 IN: scratchpad USE: tools.which
 IN: scratchpad "couchdb" which .
 f
 
 IN: scratchpad "python" which .
 "/usr/bin/python"
 
 - The trouble appears to be with reporting my PATH properly, via getenv:
 
 IN: scratchpad USE: environment
 IN: scratchpad "PATH" os-env .
 "/usr/bin:/bin:/usr/sbin:/sbin"
 
 IN: scratchpad USE: unix.ffi
 IN: scratchpad "PATH" getenv .
 "/usr/bin:/bin:/usr/sbin:/sbin"
 
 IN: scratchpad \ getenv see
 USING: alien.c-types alien.syntax ;
 IN: unix.ffi
 LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ;
 inline
 
 - Here's my actual PATH, as seen in the terminal:
 
 ➜  ~ git:(master) ✗ echo $PATH
 /usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin
 
 - whereby which correctly finds couchdb:
 
 ➜  ~ git:(master) ✗ which couchdb
 /usr/local/bin/couchdb
 
 So, Factor's which (et al.) doesn't search beyond 
 /usr/bin:/bin:/usr/sbin:/sbin.
 
 Reading through man getenv (GETENV(3), on OSX 10.6.8 ), doesn't give me
 a clue as to how to rectify this short-sightedness via the libc getenv.
 
 This is probably a side issue to my docsplit quandary (but maybe not).
 Anyone see a way to report my actual PATH to which in Factor? My PATH is
 augmented in my .zshrc. I don't understand why the libc function doesn't 
 read it. Odd, indeed!
 
 If you're starting Factor from the Finder, you're not going to get a PATH 
 set from your .profile or other shell dotfiles, since UI apps are launched 
 under the loginwindow session and not under any shell. To set environment 
 variables for UI apps, try setting them in ~/.MacOSX/environment.plist:
 
  
 https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPRuntimeConfig/Articles/EnvironmentVars.html
 
 -Joe
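
For anyone who would rather not edit the plist, the same idea can be sketched from inside Factor. This is only a sketch: it assumes that `set-os-env` from the `environment` vocabulary changes the environment of the running Factor process and that `which` re-reads PATH on every call. `prepend-to-path` is a hypothetical helper name, not a library word.

```factor
USING: environment prettyprint sequences tools.which ;

! Hypothetical helper: put dir at the front of this process's PATH.
: prepend-to-path ( dir -- )
    ":" append "PATH" os-env append "PATH" set-os-env ;

"/usr/local/bin" prepend-to-path
"PATH" os-env .      ! should now start with /usr/local/bin:
"couchdb" which .    ! a path if couchdb lives in /usr/local/bin, else f
```

The change only lasts for the current Factor session, so it would have to be repeated (for example from a startup file) each time Factor is launched from the Finder.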
 
 
 
 -- 
 ~ Memento Amori
--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


[Factor-talk] Best way to split a fixed locations

2014-02-09 Thread Jean-Marc Lugrin
Hi,
I need to split a string at fixed locations (some of the locations may
eventually be calculated, e.g. by looking up '/', but as a first
approximation fixed locations are OK).

I came up with this example string and quotation:

"NAXIS   =3 / number of data axes"

[  0 swap 8 swap subseq ] [ 10 swap 30 swap subseq ] [ 33 swap 80 swap
subseq ] tri [ [ 32 = ] trim ] tri@

This works (it does the split and the trim), but I am not too happy with the
multiple swaps. I would like to keep stack manipulation to a minimum for
clarity.
Any recommendations?
hb9duj


Re: [Factor-talk] Best way to split a fixed locations

2014-02-09 Thread John Benediktsson
Maybe something like this:

: subseqs ( indices seq -- subseqs )
    [ subseq [ CHAR: \s = ] trim ] curry { } assoc>map ;

You can see it works by returning an array of subseqs:

IN: scratchpad "NAXIS   =3 / number of data axes"
{ { 0 8 } { 10 30 } { 33 80 } } swap subseqs .
{ "NAXIS" "3" "number of data axes" }




Re: [Factor-talk] Best way to split a fixed locations

2014-02-09 Thread John Benediktsson
Or locals:

:: foo ( seq -- a b c )
    0 8 seq subseq
    10 30 seq subseq
    33 80 seq subseq
    [ [ CHAR: \s = ] trim ] tri@ ;




Re: [Factor-talk] Best way to split a fixed locations

2014-02-09 Thread Jon Harper
Hi,
you can get rid of most of the repetition by defining a word that operates
on lists instead of using tri/tri@, for example:

: subseqs ( seq indices -- subseqs )
    swap [ subseq ] curry { } assoc>map ;

"NAXIS   =3 / number of data axes"
{ { 0 8 } { 10 30 } { 33 80 } } subseqs

Also, [ 32 = ] trim is maybe more readable written as [ CHAR: space = ]
trim, or if you also want to trim tabs and \n\r, the library defines
blank?, so you can write [ blank? ] trim.

So for example, you can use:

"NAXIS   =3 / number of data axes"
{ { 0 8 } { 10 30 } { 33 80 } } subseqs [ [ blank? ] trim ] map first3

Note that the question you are asking is fairly low level, so you might
find better methods to extract the info you want (for example, split-harvest
from splitting.extras could be useful in your case).
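
As a rough illustration of splitting on delimiters rather than on fixed indices, here is a small sketch that uses `split1` from the `splitting` vocabulary in place of the `split-harvest` word mentioned above. It assumes the card always contains an "=" followed later by a "/", and that the value itself contains no "/"; `card>fields` is a hypothetical word name.

```factor
USING: kernel prettyprint sequences splitting ;

! Hypothetical word: cut at the first "=" and then at the first "/"
! after it, trimming spaces from each of the three pieces.
: card>fields ( card -- keyword value comment )
    "=" split1 "/" split1
    [ [ CHAR: \s = ] trim ] tri@ ;

"NAXIS   =                    3 / number of data axes"
card>fields [ . ] tri@
```

For this input the three printed strings should be "NAXIS", "3" and "number of data axes", regardless of how much padding the card carries.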

Cheers,
Jon




Re: [Factor-talk] Best way to split a fixed locations

2014-02-09 Thread Björn Lindqvist

Some more variants for you to consider:

Using fry:

: subseqs ( indices seq -- subseqs )
'[ first2 _ subseq [ blank? ] trim ] map ;

Using with:

: subseqs ( seq indices -- subseqs )
[ first2 rot subseq [ blank? ] trim ] with map ;

Also Jon's idea of finding higher-level splitting words to work with
is really good. E.g.

IN: scratchpad "NAXIS   =3 / number of data axes"
IN: scratchpad "/=" split [ [ blank? ] trim ] map .
{ "NAXIS" "3" "number of data axes" }
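
To try the fixed-index variants on a string that is actually long enough for the { 33 80 } pair, the sketch below builds an 80-column card first. The column layout is an assumption inferred from the index pairs in the question (keyword in 0-8, value right-aligned in 10-30, comment from 33 on); `make-card` and `card-fields` are hypothetical names.

```factor
USING: kernel prettyprint sequences ;

! Hypothetical helper: lay out keyword, value and comment in the assumed
! 80-column format, padding with spaces.
: make-card ( keyword value comment -- card )
    [ 8 CHAR: \s pad-tail "= " append ]
    [ 20 CHAR: \s pad-head " / " append ]
    [ ] tri* 3append
    80 CHAR: \s pad-tail ;

! Cut the card at the fixed index pairs and trim spaces from each piece.
: card-fields ( card -- keyword value comment )
    { { 0 8 } { 10 30 } { 33 80 } }
    [ first2 rot subseq [ CHAR: \s = ] trim ] with map first3 ;

"NAXIS" "3" "number of data axes" make-card
dup length .    ! 80
card-fields [ . ] tri@
```

The `with map` form is the one shown above; keeping the index pairs in a single literal means a computed location can later be substituted without touching the quotation.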


-- 
mvh/best regards Björn Lindqvist



Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Hi John-
Beg pardon, I should have mentioned earlier that since docsplit plants a
.txt file in the target pdf's directory on its own, with no other output,
I had gone the route you suggested, but to no avail, i.e.,

docsplit text --no-clean -l path run-process drop

In the terminal,

cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf

works fine. The surprise is that, in the listener, the phrase:

"cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .

- returns with status 0, but leaves no file. Ditto using /full/path/to/docsplit
in the command.

The docsplit bin alias (/usr/local/opt/ruby/bin/docsplit) resolves
to /usr/local/Cellar/ruby/2.1.0/bin/docsplit
(installed w/ homebrew). There I find this ruby script:

require 'rubygems'

version = ">= 0"

if ARGV.first
  str = ARGV.first
  str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
  if str =~ /\A_(.*)_\z/
    version = $1
    ARGV.shift
  end
end

gem 'docsplit', version
load Gem.bin_path('docsplit', 'docsplit', version)

If I manage to decipher this, I'll try to translate it into Factor, and
invoke docsplit that way.
That should keep me busy for a while. Worth a try, though I know zip about
ruby. Once past this boondoggle, I already have Factor code that walks the
tree & collates the files.

Thanks!
~cw






Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
It's probably easiest to specify the full path to the file, like I did
in my previous message.  Combined with the full path to the docsplit
binary/link (for your particular problem), it should theoretically
work fine:

"/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process


Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
As a follow-up, from Factor you can use `with-directory-files`
(http://docs.factorcode.org/content/word-with-directory-files,io.directories.html)
and `absolute-path`
(http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html)
to get full paths to the files in some directory:

```
IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ]
with-directory-files
"/home/alex/factor/core/generic"
"/home/alex/factor/core/parser"
"/home/alex/factor/core/sorting"
[etc]
```
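
If only the PDFs are wanted, a variation on the same idea could look like the sketch below. It swaps in `directory-files` and `append-path` instead of `with-directory-files` and `absolute-path`, and assumes the directory argument is already an absolute path; `pdf-paths` is a hypothetical helper name.

```factor
USING: io.directories io.pathnames kernel prettyprint sequences ;

! Hypothetical helper: full paths of the entries in dir ending in ".pdf".
: pdf-paths ( dir -- paths )
    dup directory-files
    [ ".pdf" tail? ] filter
    [ append-path ] with map ;

"/path/to/1_long_gu" pdf-paths .
```

Each resulting path can then be handed to docsplit as a single argument, so the command no longer depends on the directory Factor happens to be running in.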



Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Hi Alex-

Thanks, I did try

"/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

using both the symlink and the resolved executable:

/usr/local/opt/ruby/bin/docsplit
/usr/local/Cellar/ruby/2.1.0/bin/docsplit

but still no response, still status 0. A lightbulb went on, and I set a
duplicate symlink at /usr/bin/docsplit (where Factor's which can find it)
straight to /usr/local/Cellar/ruby/2.1.0/bin/docsplit:

IN: scratchpad "docsplit" which .
"/usr/bin/docsplit"

- OK, but still no success with anything in io.launcher. Oy!

I see on the web that this problem calling docsplit isn't confined to
Factor. Help calls appear in Plone-Users
<http://sourceforge.net/mailarchive/message.php?msg_id=29982797> and on
Stack Overflow re Python
<http://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit>.
This sticky wicket must have a workaround...

I'll dig around some more.
~cw





Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
Strange.  Well, not actually strange, since many programs aren't great
about return codes...but still!  I decided to re-enact the issue by
removing /usr/local/bin (where my docsplit was installed) from my PATH,
starting Factor, and trying it out.  Looks like docsplit is dumping the txt
file in the current working directory:


IN: scratchpad "docsplit" which .
f
IN: scratchpad "docsplit text --no-clean -l eng /tmp/thesis.pdf"
run-process status>> .
255
IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf"
run-process status>> .
0
IN: scratchpad "/tmp/thesis.txt" exists? .
f
IN: scratchpad "thesis.txt" exists? .
t

Seems as though you need to tell Factor to run in another working directory:

IN: scratchpad "/tmp" [
    "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf"
    run-process status .
] with-directory
0
IN: scratchpad "/tmp/thesis.txt" exists? .
t

By the way, turns out you can set the `environment` slot of an io.launcher
process, so I was thinking maybe that would help, but...

IN: scratchpad <process>
    "docsplit text --no-clean -l eng /tmp/thesis.pdf" >>command
    "/tmp/stdout.txt" >>stdout
    +stdout+ >>stderr
    { { "PATH" "/usr/local/bin" } } >>environment
run-process status .
1
IN: scratchpad "/tmp/stdout.txt" utf8 file-contents print
sh: 1: pdftotext: not found

Damn. No dice. Looks like you'll have to fix the PATH issue on the system
itself.
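
One possible reading of the failed attempt above: as I understand the io.launcher docs, entries in the `environment` slot take precedence over the inherited ones by default, so `PATH` became just `/usr/local/bin` and the shell that docsplit spawns could no longer find `pdftotext`. A minimal, untested sketch that prepends the directory to the inherited `PATH` instead (the `/usr/local/bin` location is an assumption carried over from the example):

```factor
USING: accessors assocs environment io.launcher sequences ;

! Sketch only: keep the inherited PATH and put /usr/local/bin in front,
! so both docsplit and the helpers it launches can be resolved.
<process>
    "docsplit text --no-clean -l eng /tmp/thesis.pdf -o /tmp" >>command
    "/usr/local/bin:" "PATH" os-env append "PATH" associate >>environment
run-process status .
```

Whether this is enough depends on where poppler and tesseract are installed; any other directories they live in would need to be added the same way.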

Anyway, hope that helps.

(P.S.: Charles, if you're getting this message again, it's because I think
GMail might've screwed up the reply behavior and didn't send this to the
list, so I'm re-sending it.)



On Sun, Feb 9, 2014 at 3:13 PM, CW Alston cwalsto...@gmail.com wrote:

 Hi Alex-

 Thanks, I did try

 "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process

 using both the symlink and the resolved executable:

 /usr/local/opt/ruby/bin/docsplit
 /usr/local/Cellar/ruby/2.1.0/bin/docsplit

 but still no response, still status 0. A lightbulb went on, and I set a
 duplicate symlink
 in /usr/bin/docsplit (where Factor's which can find it) straight to
 /usr/local/Cellar/ruby/2.1.0/bin/docsplit:

 IN: scratchpad "docsplit" which .
 "/usr/bin/docsplit"

 -ok, but still no success with anything in io.launcher. Oy!

 I see on the web that this problem calling docsplit isn't confined to
 Factor. Help calls appear in Plone-Users
 (http://sourceforge.net/mailarchive/message.php?msg_id=29982797)
 and on Stack Overflow regarding Python
 (http://stackoverflow.com/questions/18237442/execute-shell-commands-in-python-to-use-docsplit).
 This sticky wicket must have a workaround...

 I'll dig around some more.
 ~cw




 On Sun, Feb 9, 2014 at 2:16 PM, Alex Vondrak ajvond...@gmail.com wrote:

 As a follow-up, from Factor you can use `with-directory-files`
 (
 http://docs.factorcode.org/content/word-with-directory-files,io.directories.html
 )
 and `absolute-path`
 (http://docs.factorcode.org/content/word-absolute-path,io.pathnames.html)
 to get full paths to the files in some directory:

 ```
 IN: scratchpad "/home/alex/factor/core" [ [ absolute-path . ] each ] with-directory-files
 "/home/alex/factor/core/generic"
 "/home/alex/factor/core/parser"
 "/home/alex/factor/core/sorting"
 [etc]
 ```


 On Sun, Feb 9, 2014 at 1:53 PM, Alex Vondrak ajvond...@gmail.com wrote:
  It's probably easiest to specify the full path to the file, like I did
  in my previous message.  Combined with the full path to the docsplit
  binary/link (for your particular problem), it should theoretically
  work fine:
 
  "/full/path/to/docsplit text --no-clean -l chi_sim /path/to/1_long_gu/long_gu001.pdf" try-process
 
  On Sun, Feb 9, 2014 at 1:00 PM, CW Alston cwalsto...@gmail.com wrote:
  Hi John-
  Beg pardon, I should have mentioned earlier that since docsplit plants
 a
  .txt file in the target pdf's
  directory on its own, with no other output, I had gone the route you
  suggested, but to no avail, i.e.,
 
  docsplit text --no-clean -l path run-process drop
 
  In  the terminal,  cd /path/to/1_long_gu ; docsplit text --no-clean -l
  chi_sim long_gu001.pdf
  works fine. The surprise is that, in the listener, the phrase:
 
  "cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf" run-process .
 
  - returns with status 0, but leaves no file. Ditto using
  /full/path/to/docsplit in the command.
 
  The docsplit bin alias (/usr/local/opt/ruby/bin/docsplit) resolves to
  /usr/local/Cellar/ruby/2.1.0/bin/docsplit
  (installed w/ homebrew). There I find this ruby script:
 
  require 'rubygems'
 
  version = ">= 0"
 
  if ARGV.first
    str = ARGV.first
    str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
    if str =~ /\A_(.*)_\z/
      version = $1
      ARGV.shift
    end
  end
 
  gem 'docsplit', version
  load Gem.bin_path('docsplit', 'docsplit', version)
 
  If I manage to decipher this, I'll try to translate it in Factor, and
 invoke
  docsplit that way.
  That should keep me busy for a while. Worth a 

Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Yeah, Alex-
I would have thought the cd in my compound command string would take care
of the current directory issue.
There's another thread about this problem
(http://www.programmingrelief.com/3213645/Docsplit-Works-Fine-In-Command-Line-But-Ignores-Code-In-Ruby-Script%3F)
that finds docsplit returning files in the root directory - on my system
no files are winding up there.
Let me see what I can do w/ your path/environment suggestions.

Gonna be another long night...
Thanks much,
~cw



Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread Alex Vondrak
Thing is, `cd` isn't a binary that Factor can execute in a process.  It's
just a shell command implemented by bash or zsh or whatever you use.  Same
with the semicolon syntax, for that matter.  You might try to finagle
something like

IN: scratchpad { "sh" "-c" "cd /tmp ; pwd" } utf8 [ contents . ] with-process-reader
"/tmp\n"

Not sure how the PATH stuff will work out with that, though.
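
One way the PATH side might be handled, sketched here under the assumption that child processes inherit the environment of the running Factor instance: extend `PATH` inside Factor itself with `set-os-env` before launching anything. The `/usr/local/bin` directory is taken from the examples in this thread; adjust it to wherever docsplit and its helpers actually live.

```factor
USING: environment sequences tools.which ;

! Sketch only: prepend /usr/local/bin to this Factor process's PATH.
! Processes launched afterwards (sh, docsplit, pdftotext, ...) should
! inherit the updated value.
"/usr/local/bin:" "PATH" os-env append "PATH" set-os-env

! Check whether Factor can now locate the command.
"docsplit" which .
```

The change only lasts for the current Factor session, so it would have to be repeated (or put in a startup file) each time Factor is launched from the Finder.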

You could also try just using the `-o` flag to docsplit.  Again,
deliberately messing up my PATH so Factor can't run docsplit directly:

IN: scratchpad "docsplit" which .
f
IN: scratchpad "/tmp/thesis.pdf" exists? .
t
IN: scratchpad "/tmp/thesis.txt" exists? .
f
IN: scratchpad "/usr/local/bin/docsplit text --no-clean -l eng /tmp/thesis.pdf -o /tmp" try-process
IN: scratchpad "/tmp/thesis.txt" exists? .
t




Re: [Factor-talk] installation packages for CI?

2014-02-09 Thread Björn Lindqvist
2014-02-08 11:56 GMT+01:00 Gabriel Kerneis gabr...@kerneis.info:
 On Fri, Feb 07, 2014 at 06:44:32PM -0500, Andrew Pennebaker wrote:
 If we met users half way, presenting .deb's, .rpm's, maybe a ppa repo, that
 would be a great start.

 As a first step, I recommend using https://build.opensuse.org/

 It is slightly openSUSE centered, but makes it easy to check your
 basic rpm & deb builds for Ubuntu, Debian, Fedora and openSUSE.

A while ago I created Ubuntu packages for Factor and put them in my
PPA here: https://launchpad.net/~bjourne/+archive/factor

Someone has also packaged Factor for Arch:
https://aur.archlinux.org/packages/factor/

A big problem is that Factor's build doesn't make it easy to install
system-wide in a typical Linux setup. So you have to add lots of hacks
to the build to add support for prefixed installation, with binary and
support files split in different directories. It's a lot of work and
hard to keep in sync with Factor's GitHub repository. A smaller
problem is that some Factor words want to overwrite the image and
file and write in directories relative to the executable file, which
obviously is problematic on Linux where writes outside of $HOME are
forbidden. But I think most Linux users can live with that limitation.

So to address the problematic build I've created an alternate build
process which you are welcome to check out here:
https://github.com/slavapestov/factor/pull/934 It's written using waf,
which I think is great for complicated build processes like Factor's.
With the branch, the build command becomes python waf.py configure
--prefix=/opt/factor2 build && sudo python waf.py install. Using that
as their base, I believe someone knowledgeable about their distro's build
system could very easily package Factor.

Then to actually get distros to put Factor in their repos would entail
reopening tickets like this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471925 Plus, Debian
has some bureaucratic rules on packages they ship. Like requiring a
man page.


-- 
mvh/best regards Björn Lindqvist

--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231iu=/4140/ostg.clktrk
___
Factor-talk mailing list
Factor-talk@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/factor-talk


Re: [Factor-talk] installation packages for CI?

2014-02-09 Thread Andrew Pennebaker
I can add that ppa to my source list, but when I `apt-get install factor`,
I get a program for factoring compound numbers, not the Factor programming
language.

Could we rename the ppa to work around the name collision?





-- 
Cheers,

Andrew Pennebaker
www.yellosoft.us


Re: [Factor-talk] OCR via docsplit in Factor

2014-02-09 Thread CW Alston
Lord love a duck, Alex - I didn't realize that builtins like `cd` are
'existentially' different than utilities like `cat` -
(I only speak pidgin unix; bites me often). Thanks for the heads-up.

Okay... I'll try moving|copying my target directory into my home folder, to
obviate the need for any cd'ing (I hope),
 pass docsplit an array of pdfs and flags; or maybe have docsplit iterate
over a tmp file containing lines like:

chi_sim long_gu001.pdf
eng long_gu002.pdf
eng long_gu003.pdf ...

Probably have to do this in a script. Never a dull moment.
~cw
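
The language/file list above could perhaps be driven from Factor directly rather than from a separate script. A rough, untested sketch: `ocr-pdf` is a hypothetical helper name, the docsplit location is an assumption, and the `-o` flag is the one Alex suggested. Passing the command as an array of strings avoids shell parsing, so no `cd` is needed.

```factor
USING: io.launcher locals sequences ;

! Hypothetical helper: run docsplit on one PDF, writing the .txt into out-dir.
! try-process throws an error if docsplit exits with a non-zero status.
:: ocr-pdf ( lang pdf out-dir -- )
    { "/usr/local/bin/docsplit" "text" "--no-clean" "-l" lang pdf "-o" out-dir }
    try-process ;

! Each entry pairs a tesseract language code with a PDF path.
{
    { "chi_sim" "/path/to/1_long_gu/long_gu001.pdf" }
    { "eng"     "/path/to/1_long_gu/long_gu002.pdf" }
    { "eng"     "/path/to/1_long_gu/long_gu003.pdf" }
} [ first2 "/path/to/1_long_gu" ocr-pdf ] each
```

This still depends on docsplit's own helpers (pdftotext, tesseract) being reachable through the PATH that Factor passes on.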


