I built a system in Rails 2.3.8 that accepted PDF uploads and needed
to extract their text content using the venerable (read: ancient)
pdftotext command-line utility. I had to jump through the following
hoops to make it work, and this might have some bearing on your
solution:
# model
has_attached_file :pdf,
  :styles => { :text => { :fake => 'variable' } },
  :processors => [:text]
after_post_process :extract_text

private

# Read the text file written by the :text processor and keep an
# ASCII-only copy in the plain_text attribute.
def extract_text
  plain_text = ""
  # The block form closes the file once reading is done.
  File.open(pdf.queued_for_write[:text].path, "r") do |file|
    while (line = file.gets)
      plain_text << Iconv.conv('ASCII//IGNORE', 'UTF8', line)
    end
  end
  self.plain_text = plain_text
end
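One note on the Iconv call above: Iconv was deprecated in Ruby 1.9.3 and removed from the standard library in 2.0, so on a newer Ruby something like String#encode would probably have to stand in for it. A minimal sketch, assuming the input is UTF-8; the helper name to_ascii is made up for illustration and is not part of my model:

```ruby
# Hypothetical helper: drops every character that has no ASCII
# equivalent, roughly what ASCII//IGNORE does in Iconv.
def to_ascii(line)
  line.encode('ASCII', :invalid => :replace, :undef => :replace, :replace => '')
end

sample = "caf\u00e9 na\u00efve\n"   # assumed UTF-8 input
ascii  = to_ascii(sample)
```

Accented characters are dropped rather than transliterated, which was good enough for my purposes.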
# lib/paperclip_processors/text.rb
module Paperclip
  # Handles extracting plain text from PDF file attachments
  class Text < Processor
    attr_accessor :whiny

    # Creates a Text extract from PDF
    def make
      src = @file
      dst = Tempfile.new([@basename, 'txt'].compact.join("."))
      command = <<-end_command
        "#{ File.expand_path(src.path) }"
        "#{ File.expand_path(dst.path) }"
      end_command
      begin
        success = Paperclip.run("/usr/bin/pdftotext -nopgbrk",
                                command.gsub(/\s+/, " "))
        Rails.logger.info "Processing #{src.path} to #{dst.path} in the text processor."
      rescue PaperclipCommandLineError
        raise PaperclipError, "There was an error processing the text for #{@basename}" if @whiny
      end
      dst
    end
  end
end
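The processor above wraps the two paths in double quotes by hand, which breaks if a path itself contains a quote. A possible alternative is the Shellwords module from Ruby's standard library. This is only a sketch: pdftotext_args is a hypothetical helper name, the sample paths are invented, and I have not run it through Paperclip.run.

```ruby
require 'shellwords'

# Hypothetical helper: returns the source and destination paths as one
# shell-escaped argument string for pdftotext.
def pdftotext_args(src_path, dst_path)
  [src_path, dst_path].map { |path|
    Shellwords.escape(File.expand_path(path))
  }.join(" ")
end

# Invented example paths; the space in the first one gets escaped.
args = pdftotext_args("/tmp/my report.pdf", "/tmp/out.txt")
```

The resulting string would take the place of command.gsub(/\s+/, " ") in the call above.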
Within the environs of Paperclip, you can write processors that do
pretty much anything, and usually result in a new file saved as a new
format in the attachments hierarchy. Once that process is done, you
can access the result file and do other stuff with it. But I'm not
sure if that answers your question at all, since you don't seem to be
facing the same problem I was.
If your form posts a file to Paperclip, you don't get access to the
file parts of that form submission directly in your controller, unless
I'm missing something fundamental. But a processor can access them
directly, at a very low level.
Walter
On Oct 12, 2010, at 12:01 AM, Christian Fazzini wrote:
Walter Lee, I am trying to pass a value (stream_type) that was
submitted in my form to :styles using a proc.
Radhames, even if I do all this in the model, without a processor, I
am still not able to get the value of stream_type, that was passed
when the form tried to submit.
So Philip, this means I would need to reprocess the file and delete
the original? That means I will be processing the file twice. Isn't
this a waste of resources if I only want one processed file that
matches the stream_type?
On Oct 12, 12:07 am, Philip Hallstrom <[email protected]> wrote:
has_attached_file gets read when the class file is first read. It
then sets up the various paperclip methods that do their stuff.
When it does that, 'instance' is a new/blank record.
What you have below will work (if memory serves me right) if you
save the record, reload it, and then reprocess it.
Something along those lines... it's been a while since I ran into
this.
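To make the timing concrete, here is a plain-Ruby stand-in. It is not Paperclip's actual code: FakeSong and its methods are hypothetical, and it assumes processing fires as soon as the file is assigned, before stream_type has been set.

```ruby
# Plain-Ruby stand-in for a model whose attachment style is a Proc.
# Nothing here is Paperclip's real implementation.
class FakeSong
  attr_accessor :stream_type
  attr_reader :processed_with

  # Stored once, when the class body is read.
  STYLE = Proc.new { |instance| instance.stream_type }

  # Assumption: processing runs as soon as the file is assigned.
  def media=(file)
    @media = file
    reprocess!
  end

  # Re-evaluates the Proc against the record's current state.
  def reprocess!
    @processed_with = STYLE.call(self)
  end
end

song = FakeSong.new
song.media = "track.mp3"       # stream_type has not been set yet
first = song.processed_with    # the Proc saw a blank record

song.stream_type = "30"
song.reprocess!
second = song.processed_with   # the Proc now sees the stored value
```

The first pass gets nothing because the record is still blank at that point; only the second pass, after the attribute is in place, picks up the value.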
-philip
On Oct 10, 2010, at 1:22 PM, Christian Fazzini wrote:
Anyone? I have been on this for days without a solution. I'm thinking
that I may have to switch to another gem just to get this feature to
work...
On Oct 7, 3:16 pm, Christian Fazzini <[email protected]> wrote:
I have a stream_type field on my form. When the form submits,
instance.stream_type is blank. To verify this, in my custom processor
(class ProcessAudio < Processor), I do puts options[:geometry].
has_attached_file :media,
  :styles => { :original => Proc.new { |instance| instance.stream_type } },
  :url => '/assets/artists/:artist_id/songs/:id/:style.:extension',
  :path => ':rails_root/public/assets/artists/:artist_id/songs/:id/:style.:extension',
  :processors => [:process_audio]
If I provide a fixed string, for example:
has_attached_file :media,
  :styles => { :original => '30' },
  :url => '/assets/artists/:artist_id/songs/:id/:style.:extension',
  :path => ':rails_root/public/assets/artists/:artist_id/songs/:id/:style.:extension',
  :processors => [:process_audio]
then puts options[:geometry] prints 30. Why does it NOT work with a proc?
What is wrong?
--
You received this message because you are subscribed to the Google
Groups "Ruby on Rails: Talk" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.