[R-lang] Re: Relationship between individual speaker mean and priming in conversational speech data

Meredith Tamminga Sat, 25 Jan 2014 20:05:48 -0800

Hi Florian,

Ah, yeah turning it around to comprehension-to-production priming would
definitely do the trick! I hadn't thought of that. Unfortunately right now
I don't have that data because I haven't coded the interlocutor speech, so
I only have production-to-production data (and I exclude observations where
the target should be considered "primed" by some other interlocutor).
Here's what I do have: 122 interviews from the Philadelphia Neighborhood
Corpus, with /ing/-vs-/in/ coded for all types of ING (not just verbs, also
nouns etc). There's an average of 39 tokens for each speaker, total N=4803,
but some speakers have a lot more data than others. So, for example, if I
exclude speakers with fewer than 50 tokens, there are 45 speakers with a
mean of 83 tokens each.
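
For a rough sense of how precisely a single speaker's baseline can be estimated at these token counts, here is a minimal sketch of the binomial standard error. It assumes tokens are independent draws, which priming itself would violate, so it is best read as an optimistic bound. The function name `baseline_se` is hypothetical.

```python
import math

def baseline_se(p, n_tokens):
    """Binomial standard error of an estimated /ing/ rate p based on
    n_tokens tokens, treating tokens as independent (an assumption)."""
    return math.sqrt(p * (1 - p) / n_tokens)

# Token counts taken from the message above; p = 0.5 is the worst case.
for n in (39, 83):
    print(f"n = {n:3d}  SE at p = 0.5: {baseline_se(0.5, n):.3f}")
```

At p = 0.5 this comes to about 0.08 with 39 tokens and about 0.055 with 83, and it shrinks as the baseline moves toward 0 or 1.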


Let me know if there's any other info I can provide or if you'd like me to
send the data your way. I'm starting to think that even with what I
originally thought was a lot of data (ditched Buckeye as insufficient a
while ago!), it's still not going to be possible to do what I'm trying to
do. In which case I may have to table the question until I get the
interlocutor data coded to try out your suggestion...

Thanks!
Meredith




On Sat, Jan 25, 2014 at 10:28 PM, T. Florian Jaeger <ti...@csli.stanford.edu
> wrote:

> Hi Meredith,
>
> ha, i love that question (yours). I recall that in the preparation of
> Jaeger & Snider 2013-Cognition, we looked into whether we could answer that
> question in the Switchboard, but there were too few speakers with
> sufficiently many sentences. probably something like Buckeye would be the
> corpus to use. And using -ing as you suggest probably gives you much more
> data than the ditransitives we had. There's some preliminary evidence in
> favor of a positive answer to your question. In Jaeger 2010 -CogPsych, I
> discuss evidence from complement clause priming. I point to the finding
> that complementizer presence primes more strongly in conversational speech
> data (where it's rare) but less strongly in sentence production studies
> (where it's the majority choice). If you search that paper for "syntactic
> priming", you'll find this in the middle of my other ramblings.
>
> Ok, but back to your question. What type of data do you have? If you can
> solidly estimate the production bias of a speaker (ideally for a variety of
> verbs), e.g., when not talking to anyone or when talking to interlocutor
> A, you should be able to use that to predict how surprising primes during
> *comprehension* are when the same speaker listens to interlocutor B. By
> checking how much the speaker then deviates from her own baseline after a
> comprehension prime, you should be able to tease things apart, no (i.e., in
> this case, you'd be looking at comprehension-to-production priming)? This
> might make sense anyway -- in Jaeger & Snider 13 it seemed like the
> surprisal effects might originate in comprehension, though we discuss
> alternative interpretations, too (see Section 5.4). Are you wondering
> exclusively about production-to-production priming?
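
The "how surprising is the prime" step in this suggestion can be sketched for the ING case as follows. This is only an illustration, and it rests on two assumptions: that the listener's expectation equals her own production baseline rate of /ing/, and that surprisal is measured in bits (log base 2). The function name `prime_surprisal` is hypothetical.

```python
import math

def prime_surprisal(baseline_ing, prime_is_ing):
    """Surprisal (in bits) of a heard prime variant, taking the listener's
    own /ing/ production rate as her expectation (an assumption)."""
    if not 0.0 < baseline_ing < 1.0:
        # A categorical speaker would give infinite surprisal; some
        # smoothing of the baseline would be needed in that case.
        raise ValueError("baseline_ing must be strictly between 0 and 1")
    p = baseline_ing if prime_is_ing else 1.0 - baseline_ing
    return -math.log2(p)

# Hypothetical baselines, for illustration only.
for baseline in (0.10, 0.50, 0.90):
    s_ing = prime_surprisal(baseline, True)
    s_in = prime_surprisal(baseline, False)
    print(f"baseline = {baseline:.2f}  /ing/ prime: {s_ing:.2f} bits  "
          f"/in/ prime: {s_in:.2f} bits")
```

A value like this could then enter the model as a predictor of how far the target deviates from the speaker's baseline after a comprehension prime.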
>
> Florian
>
>
> On Fri, Jan 24, 2014 at 9:57 AM, Meredith Tamminga <
> tammi...@babel.ling.upenn.edu> wrote:
>
>> Hi Florian,
>>
>> Thanks for the response. I'm reassured that you think a random by-speaker
>> intercept will filter out the speaker baseline usage effect, since I've
>> been using that approach in other portions of this project. I've done a bit
>> of simulation testing this out and it does seem to be fine, at least for
>> the distributions of between-speaker differences I'm working with.
>>
>> But, I'm still not sure how to then separate out and identify any *true*
>> effect of the speaker baseline rate on persistence. In Jaeger & Snider 2007
>> you ask the very interesting question "probable given what?" about how
>> surprisal affects the strength of the persistence effect. What I really
>> want to ask here is whether there is an effect of an individual speaker's
>> habitual usage in terms of whether /in/ or /ing/ seems more surprising, or
>> whether the speakers generally share an evaluation where, say, /in/
>> provokes a larger persistence effect because it is nonstandard. In other
>> words, could one aspect of the surprisal effect on persistence be "probable
>> given the speaker's own preferences"? The problem, then, is to disentangle
>> the uninteresting effect (that speakers with more extreme baselines have
>> more apparent clustering as a matter of course) from the potential effect
>> of interest (an effect where different speaker baselines produce greater or
>> lesser degrees of persistence *beyond* the amount expected if there were
>> no priming at work).
>>
>> I'm not aware of any other work in this area that has really grappled
>> with this issue yet -- since it's a question that I don't believe has been
>> asked. Your work on the preference of individual verbs for certain
>> constructions doesn't face this problem, I don't think, because the prime
>> and target aren't constrained to be drawn from the same distribution.
>>
>> One possibility I've been considering is some sort of transformation of
>> the speaker baseline, so that in simulated data generated through a set of
>> binomial trials with the weights of the empirical speaker baselines there
>> would be no apparent relationship between baseline and amount of
>> clustering. I haven't worked out quite what transformation that should be,
>> and I'm not entirely convinced it's a good approach.
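
The kind of no-priming simulation described here might look like the sketch below. It assumes each token is an independent Bernoulli draw at the speaker's baseline and takes each token's immediate predecessor as its prime. The baselines are made up for illustration (real ones would come from spkr.mean), and `simulate_rep_rate` is a hypothetical name.

```python
import random

def simulate_rep_rate(p, n_tokens, rng):
    """Draw n_tokens independent /ing/ (True) vs /in/ (False) tokens at
    rate p and return the share of consecutive pairs that match."""
    tokens = [rng.random() < p for _ in range(n_tokens)]
    pairs = list(zip(tokens, tokens[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

rng = random.Random(2014)  # fixed seed so the sketch is reproducible
# Hypothetical baselines, not the empirical ones from the corpus.
for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    simulated = simulate_rep_rate(p, 20000, rng)
    analytic = p ** 2 + (1 - p) ** 2  # match probability with no priming
    print(f"p = {p:.2f}  simulated = {simulated:.3f}  analytic = {analytic:.3f}")
```

Under these assumptions the simulated repetition rate tracks p^2 + (1 - p)^2, so one candidate correction would be to compare observed repetition against that value. Whether that is the right transformation is left open.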
>>
>> Any further thoughts are much appreciated!
>>
>> Meredith
>>
>>
>>
>>
>> On Wed, Jan 22, 2014 at 11:53 PM, T. Florian Jaeger <
>> ti...@csli.stanford.edu> wrote:
>>
>>> Hi Meredith,
>>>
>>> I think that a mixed model with a random by-speaker intercept should
>>> successfully filter out the speaker baseline usage effect (though I haven't
>>> simulated how well this works, depending on the distribution of
>>> between-speaker differences). This approach is used in e.g., Jaeger (2006,
>>> 2010) and Jaeger & Snider (2013, Study 1). A slightly different approach is
>>> presented by Reitter (2006 and follow up) who focuses on the idea of
>>> distance-based decay of priming (rather than a priming main effect). There
>>> are a variety of alternative approaches discussed in the literature. For
>>> example, researchers from Frank Keller's lab have used a first half vs.
>>> second half of conversation comparison. I forgot the reference to the
>>> original proposal they cite, but if memory serves right Dubey et al. 2010
>>> discuss and use this approach.
>>>
>>> You might also want to look into Gries 2005 and Szmrecsanyi 2005, though I don't
>>> recall whether they in any way controlled for the baseline issue.
>>>
>>> HTH & sorry if I misrepresented anyone's work,
>>>
>>> Florian
>>>
>>> Gries, S. T. (2005). Syntactic priming: a corpus-based approach. Journal
>>> of Psycholinguistic Research, 34(4), 365–99. doi:10.1007/s10936-005-6139-3
>>> Jaeger, T. F., & Snider, N. E. (2013). Alignment as a consequence of
>>> expectation adaptation: Syntactic priming is affected by the prime’s
>>> prediction error given both prior and recent experience. Cognition, 127(1),
>>> 57–83. doi:10.1016/j.cognition.2012.10.013
>>> Reitter, D., Moore, J. D., & Keller, F. (2006). Priming of Syntactic
>>> Rules in Task-Oriented Dialogue and Spontaneous Conversation. In
>>> Proceedings of the 28th Annual Conference of the Cognitive Science Society
>>> (pp. 1–6).
>>> Sturt, P., Keller, F., & Dubey, A. (2010). Syntactic priming in
>>> comprehension: Parallelism effects with and without coordination. Journal
>>> of Memory and Language, 62(4), 333–351.
>>> Szmrecsanyi, B. (2005). Language users as creatures of habit: A
>>> corpus-based analysis of persistence in spoken English. Corpus Linguistics
>>> and Linguistic Theory, 1(1), 113–150. doi:10.1515/cllt.2005.1.1.113
>>>
>>>
>>>
>>> On Tue, Jan 21, 2014 at 11:54 AM, Meredith Tamminga <
>>> tammi...@babel.ling.upenn.edu> wrote:
>>>
>>>> Hello R-lang,
>>>>
>>>> I have a dataset consisting of observations of the binary variable ING
>>>> (e.g. workin' vs working) from conversational speech. I am interested in
>>>> the priming effect on this variable: to what extent does the most recent
>>>> prior observation (the prime) affect the outcome of the current observation
>>>> (the target)? To assess this, I am using the dependent variable of "rep" --
>>>> does the target have the same value (/in/ or /ing/) as the prime? I intend
>>>> to use this as the response variable for logistic regression. Available
>>>> predictors are prime.var (is the prime a token of /ing/ or /in/?),
>>>> target.var (is the target a token of /ing/ or /in/?), same.word (are prime
>>>> and target the same lexical item?), same.gram (do prime and target have the
>>>> same grammatical status?), log.lag (log2 of the distance in seconds between
>>>> the prime and the target), and spkr.mean (the mean rate of /ing/ use by the
>>>> speaker who produced the token). There are 4879 observations (prime/target
>>>> pairs) from 90 different speakers who have very different mean rates of
>>>> /ing/ use. The prime and the target are always from the same speaker.
>>>>
>>>> So here's the problem: I can't figure out how to account for the fact
>>>> that repetition of the variant is more likely as a matter of course (rather
>>>> than as a matter of priming) when speakers have rates of /ing/ use near 0%
>>>> or 100%. I am particularly interested in testing the hypothesis of an
>>>> interaction between prime.var and spkr.mean: that for speakers who have low
>>>> /in/ rates, /in/ is a stronger prime than /ing/, whereas for speakers who
>>>> have low /ing/ rates, /ing/ is a stronger prime than /in/. If I include
>>>> prime.var * spkr.mean in the model, though, the effect I am looking for is
>>>> obscured by the trivial fact that speakers who use /ing/ a lot will
>>>> naturally be likely to have another /ing/ target after an /ing/ prime, and
>>>> speakers who use /in/ a lot will naturally be likely to have an /in/ target
>>>> after an /in/ prime. What I'm trying to figure out is whether there is an
>>>> interaction of prime.var and spkr.mean *beyond* what is expected just given
>>>> that the prime and target share a bias.
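
The shared-bias confound described here can be written down directly. This is a minimal sketch, assuming prime and target are independent draws at the same speaker-specific /ing/ rate p, i.e. with no priming at all. The function names are hypothetical and not part of the dataset.

```python
def chance_rep(p):
    """P(target matches prime) when both are independent draws at /ing/
    rate p, with no priming."""
    return p * p + (1 - p) * (1 - p)

def chance_rep_given_prime(p, prime_is_ing):
    """P(rep) conditional on the prime variant under the same assumption:
    the target matches an /ing/ prime with probability p, and an /in/
    prime with probability 1 - p."""
    return p if prime_is_ing else 1 - p

# Hypothetical spkr.mean values, for illustration only.
for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(f"spkr.mean = {p:.2f}  P(rep) = {chance_rep(p):.3f}  "
          f"P(rep | /ing/ prime) = {chance_rep_given_prime(p, True):.2f}  "
          f"P(rep | /in/ prime) = {chance_rep_given_prime(p, False):.2f}")
```

Chance repetition is 0.5 at p = 0.5 and approaches 1 as p nears 0 or 1. The conditional rates also show why a prime.var * spkr.mean interaction on rep appears even without priming.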
>>>>
>>>> If anyone can make any suggestions for how to proceed here it would be
>>>> much appreciated. Please let me know if there's anything I can clarify or
>>>> add.
>>>>
>>>> Thanks!
>>>> Meredith
>>>>
>>>
>>>
>>
>
