Hello Robin, hello all,

@Randal: I have read your complaints, and I gather you think I want to do
something harmful.

I do not want to do anything harmful. I am working on my PhD, and I need to
collect some more data.

I work in the field of social research, especially online research; see:
 http://opensource.mit.edu/online_papers.php
 http://opensource.mit.edu/

My current investigation includes some analysis of online discussions.

First of all, let me explain: I have to extract data from a phpBB board in
order to do some field research. I need the data from a forum that is run by
a user community, so that I can analyze its discussions.

To give an example, take this forum here: how can I grab all the data out of
this forum, store it locally, and afterwards load it into a local database of
a phpBB forum? Is this possible?

What I have in mind is nothing harmful, nothing bad, nothing serious or
dangerous. But the fact remains: I have to get the data.
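To make the technical side concrete, here is a minimal sketch of the kind of
fetch loop I have in mind. The base URL and topic IDs are placeholder
assumptions, not the real board; the actual fetch only runs when a board is
configured through an environment variable, and it sleeps between requests so
the robot stays polite:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder values: the real board's base URL and topic IDs would go here.
my $base      = $ENV{FORUM_BASE} || 'http://www.example.com/phpBB2';
my @topic_ids = (1 .. 3);

# phpBB 2.x serves a topic at viewtopic.php?t=<topic_id>.
sub topic_url {
    my ($b, $id) = @_;
    return "$b/viewtopic.php?t=$id";
}

my @urls = map { topic_url($base, $_) } @topic_ids;
print "$_\n" for @urls;

# Only fetch when a real board is configured, and pause between
# requests so the robot stays polite.
if ($ENV{FORUM_BASE}) {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(agent => 'research-grabber/0.1');
    for my $url (@urls) {
        my $res = $ua->get($url);
        warn "failed $url: ", $res->status_line, "\n"
            unless $res->is_success;
        sleep 5;    # polite delay between requests
    }
}
```

This is only a sketch under my own assumptions, not a finished tool.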


I need to extract forum messages and other data (forum topics, users) into a
database. The purpose is to create a copy of the forum for text analysis.
Does anyone have a rough solution?

I need to fetch the data over HTTP for further analysis and write it to CSV,
so that I end up with a dump that can fill the local database of a phpBB board.
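As a sketch of the CSV step: the field layout and file name here are my own
assumptions, and Text::CSV would be the more robust choice if it is installed:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Minimal CSV quoting; Text::CSV is the more robust choice if installed.
sub csv_row {
    my @fields = @_;
    return join(',', map {
        my $f = defined $_ ? $_ : '';
        $f =~ s/"/""/g;                   # double any embedded quotes
        $f =~ /[",\n]/ ? qq{"$f"} : $f;   # quote fields that need it
    } @fields) . "\n";
}

# Hypothetical post records; the column layout is my own assumption.
my @posts = (
    [ 'alice', 'General', 42, 'Hello world', 'First post, with a "quote"' ],
    [ 'bob',   'General', 42, 'Hello world', "A reply,\nwith a newline"   ],
);

open my $out, '>', 'posts.csv' or die "posts.csv: $!";
print {$out} csv_row('username', 'forum', 'topic_id', 'topic_title', 'post_text');
print {$out} csv_row(@$_) for @posts;
close $out;
```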

I need the data in an almost full and complete form, so I need all fields
such as:

username
forum
thread
topic
text of the posting, and so on.

See http://www.phpbbdoctor.com/doc_tables.php for a full overview of the
phpBB tables.
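To illustrate the extraction of those fields, here is a sketch that pulls
(username, post text) pairs out of phpBB 2.x-style markup with a simple regex.
The class names ("name", "postbody") come from the stock subSilver template
and may differ on other boards; a real parser such as HTML::TreeBuilder would
be more robust:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A tiny sample of phpBB 2.x-style markup; the class names come from the
# stock subSilver template, and a real board's theme may differ.
my $html = <<'HTML';
<span class="name"><b>alice</b></span>
<span class="postbody">Hello, this is the first post.</span>
<span class="name"><b>bob</b></span>
<span class="postbody">And this is a reply.</span>
HTML

# Pull out (username, post text) pairs. For real pages a proper parser
# such as HTML::TreeBuilder is more robust than a regex.
my @posts;
while ($html =~ m{<span class="name"><b>(.*?)</b></span>\s*<span class="postbody">(.*?)</span>}gs) {
    push @posts, { user => $1, text => $2 };
}

printf "%s: %s\n", $_->{user}, $_->{text} for @posts;
```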


How do I do that? I need some kind of grabbing tool; can I do it with such a
tool? And how do I solve the problem of storing the results in a local MySQL
database? As you can see, this is tricky work, and I am fairly sure I will
get help here. For any and all help I am very thankful; many thanks in
advance.
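For the storing issue, a sketch of the MySQL side: the helper builds a
placeholder INSERT (testable without a database), and the actual insert only
runs when a DSN is supplied through environment variables. The table and
column names follow the stock phpBB 2.x schema; the connection details are
placeholders of mine:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a placeholder INSERT statement for a table and column list.
sub insert_sql {
    my ($table, @cols) = @_;
    return sprintf 'INSERT INTO %s (%s) VALUES (%s)',
        $table, join(',', @cols), join(',', ('?') x @cols);
}

# Column names follow the stock phpBB 2.x schema; the connection details
# are placeholders, so the insert only runs when a DSN is configured.
my $sql = insert_sql('phpbb_posts_text', qw(post_id post_subject post_text));
print "$sql\n";

if (my $dsn = $ENV{FORUM_DSN}) {
    require DBI;
    my $dbh = DBI->connect($dsn, $ENV{FORUM_USER}, $ENV{FORUM_PASS},
                           { RaiseError => 1 });
    my $sth = $dbh->prepare($sql);
    $sth->execute(1, 'Hello world', 'First post.');
    $dbh->disconnect;
}
```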

And now, Robin and Randal: I am willing to discuss the implications that come
with my idea, my wish. But believe me, I mean nothing harmful.


I could run my investigation with a browser as well: I could load the 700
threads by hand. They are online, so what is the difficulty? Everything is
already public; I do not really understand the difference here, but I am open
to discussing it with you.


I look forward to hearing your ideas and suggestions, and, after the legal
(and ethical) discussion, I am looking forward to a technical discussion.

jobst

- an ethno-researcher
> -----Original Message-----
> From: Robin Norwood <[EMAIL PROTECTED]>
> Sent: 26.08.06 16:18:07
> To: merlyn@stonehenge.com (Randal L. Schwartz)
> CC: beginners@perl.org
> Subject: Re: subroutine in LWP - in order to get 700 forum threads


> merlyn@stonehenge.com (Randal L. Schwartz) writes:
> 
> >>>>>> "jobst" == jobst müller <[EMAIL PROTECTED]> writes:
> >
> > jobst> to admit - i am a Perl-novice and ihave not so much experience in
> > jobst> perl. But i am willing to learn. i want to learn perl. As for now i
> > jobst> have to solve some tasks for the college. I have to do some
> > jobst> investigations on a board where i have no access to the db.
> >
> > If you don't have access to the database, what makes you think you have
> > permission to run a robot against the web API?
> >
> > Where are the ethics trainings when we need it?  Sheesh.
> >
> > In other words, to make it very clear:
> >
> >         DO NOT ATTEMPT TO DO THIS
> 
> Really?  If I understood the OP correctly, all he wants to do is 'screen
> scrape' the (public) board in question.  In other words, nothing
> significantly different from what Google does when it indexes.  I don't
> really see an ethical (as opposed to legal - IANAL!) problem with that.
> Of course, I would first email the admin for permission, and make *sure*
> that such a bot is 'well behaved' - such as adding calls to sleep inside
> some of those loops.  After he gets the data, he could do something
> unethical with it - like republish it.  But just getting the data
> doesn't seem wrong to me.
> 
> As I said above, I am not a lawyer!  The above should not be taken to
> mean I think it is legal to do this.  But it does sound ethical to me.
> 
> -RN
> 
> -- 
> Robin Norwood
> Red Hat, Inc.
> 
> "The Sage does nothing, yet nothing remains undone."
> -Lao Tzu, Te Tao Ching
> 
> -- 
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
> 
> 
> -----Original Message-----
> From: merlyn@stonehenge.com (Randal L. Schwartz)
> Sent: 26.08.06 16:19:44
> To: Robin Norwood <[EMAIL PROTECTED]>
> CC: beginners@perl.org
> Subject: Re: subroutine in LWP - in order to get 700 forum threads


> >>>>> "Robin" == Robin Norwood <[EMAIL PROTECTED]> writes:
> 
> >> DO NOT ATTEMPT TO DO THIS
> 
> Robin> Really?  If I understood the OP correctly, all he wants to do is
> Robin> 'screen scrape' the (public) board in question.  In other words, nothing
> Robin> significantly different from what Google does when it indexes.  I don't
> Robin> really see an ethical (as opposed to legal - IANAL!) problem with that.
> Robin> Of course, I would first email the admin for permission, and make *sure*
> Robin> that such a bot is 'well behaved' - such as adding calls to sleep inside
> Robin> some of those loops.  After he gets the data, he could do something
> Robin> unethical with it - like republish it.  But just getting the data
> Robin> doesn't seem wrong to me.
> 
> It's one thing to be google, and index all the pages for public use.
> 
> It's entirely another to do it for your own personal gain (knowledge
> or commerce, doesn't matter).
> 
> If you can't see the difference, you need to retune your ethics.
> 
> -- 
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
> <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
> See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
> 
> 
> 





